Methods for determining copy number variations

ABSTRACT

The invention provides a method for determining copy number variations (CNV) of a sequence of interest in a test sample that comprises a mixture of nucleic acids that are known or are suspected to differ in the amount of one or more sequences of interest. The method comprises a statistical approach that accounts for accrued variability stemming from process-related, interchromosomal, and inter-sequencing variability. The method is applicable to determining CNV of any fetal aneuploidy, and CNVs known or suspected to be associated with a variety of medical conditions.

CROSS-REFERENCE

This Application claims priority to U.S. Provisional Application Ser. No. 61/296,358 entitled “Methods for Determining Fraction of Fetal Nucleic Acids in Maternal Samples”, filed on Jan. 19, 2010; U.S. Provisional Application Ser. No. 61/360,837 entitled “Methods for Determining Fraction of Fetal Nucleic Acids in Maternal Samples”, filed on Jul. 1, 2010; U.S. Provisional Application Ser. No. 61/407,017 entitled “Method for Determining Copy Number Variations”, filed on Oct. 26, 2010; and U.S. Provisional Application Ser. No. 61/455,849 entitled “Simultaneous determination of Aneuploidy and Fetal Fraction”, filed on Oct. 26, 2010; which are incorporated herein by reference in their entirety.

1. FIELD OF THE INVENTION

The invention relates generally to the field of diagnostics, and provides a method for determining variations in the amount of nucleic acid sequences in a mixture of nucleic acids derived from different genomes. In particular, the method is applicable to the practice of noninvasive prenatal diagnostics, and to the diagnosis and monitoring of metastatic progression in cancer patients.

2. BACKGROUND OF THE INVENTION

One of the critical endeavors in human medical research is the discovery of genetic abnormalities that are central to adverse health consequences. In many cases, specific genes and/or critical diagnostic markers have been identified in portions of the genome that are present at abnormal copy numbers. For example, in prenatal diagnosis, extra or missing copies of whole chromosomes are the frequently occurring genetic lesions. In cancer, deletion or multiplication of copies of whole chromosomes or chromosomal segments, and higher level amplifications of specific regions of the genome, are common occurrences.

Most information about copy number variation has been provided by cytogenetic resolution that has permitted recognition of structural abnormalities. Conventional procedures for genetic screening and biological dosimetry have utilized invasive procedures e.g. amniocentesis, to obtain cells for the analysis of karyotypes. Recognizing the need for more rapid testing methods that do not require cell culture, fluorescence in situ hybridization (FISH), quantitative fluorescence PCR (QF-PCR) and array-Comparative Genomic Hybridization (array-CGH) have been developed as molecular-cytogenetic methods for the analysis of copy number variations.

The advent of technologies that allow for sequencing entire genomes in relatively short time, and the discovery of circulating cell-free DNA (cfDNA) have provided the opportunity to compare genetic material originating from one chromosome to that of another without the risks associated with invasive sampling methods. However, the limitations of the existing methods, which include insufficient sensitivity stemming from the limited levels of cfDNA, and the sequencing bias of the technology stemming from the inherent nature of genomic information, underlie the continuing need for noninvasive methods that would provide any or all of the specificity, sensitivity, and applicability, to reliably diagnose copy number changes in a variety of clinical settings.

The present invention fulfills some of the above needs and in particular offers an advantage in providing a reliable method that is applicable at least to the practice of noninvasive prenatal diagnostics, and to the diagnosis and monitoring of metastatic progression in cancer patients.

3. SUMMARY OF THE INVENTION

The invention provides a method for determining copy number variations (CNV) of a sequence of interest in a test sample that comprises a mixture of nucleic acids that are known or are suspected to differ in the amount of one or more sequences of interest. The method comprises a statistical approach that accounts for accrued variability stemming from process-related, interchromosomal, and inter-sequencing variability. The method is applicable to determining CNV of any fetal aneuploidy, and CNVs known or suspected to be associated with a variety of medical conditions.

In one embodiment, the invention provides a method for identifying fetal trisomy 21, said method comprising the steps: (a) obtaining sequence information for a plurality of fetal and maternal nucleic acid molecules of a maternal blood sample e.g. a plasma sample; (b) using the sequence information to identify a number of mapped sequence tags for chromosome 21; (c) using the sequence information to identify a number of mapped sequence tags for at least one normalizing chromosome; (d) using the number of mapped sequence tags identified for chromosome 21 in step (b) and the number of mapped sequence tags identified for the at least one normalizing chromosome in step (c) to calculate a chromosome dose for chromosome 21; and (e) comparing said chromosome dose to at least one threshold value, and thereby identifying the presence or absence of fetal trisomy 21. In one embodiment, step (d) comprises calculating a chromosome dose for chromosome 21 as the ratio of the number of mapped sequence tags identified for chromosome 21 and the number of mapped sequence tags identified for the at least one normalizing chromosome. Alternatively, step (d) comprises (i) calculating a sequence tag density ratio for chromosome 21, by relating the number of mapped sequence tags identified for chromosome 21 in step (b) to the length of chromosome 21; (ii) calculating a sequence tag density ratio for said at least one normalizing chromosome, by relating the number of mapped sequence tags identified for said at least one normalizing chromosome in step (c) to the length of said at least one normalizing chromosome; and (iii) using the sequence tag density ratios calculated in steps (i) and (ii) to calculate a chromosome dose for chromosome 21, wherein the chromosome dose is calculated as the ratio of the sequence tag density ratio for chromosome 21 and the sequence tag density ratio for said at least one normalizing chromosome.
The at least one normalizing chromosome is a chromosome having the smallest variability and/or the greatest differentiability. The at least one normalizing chromosome is selected from chromosome 9, chromosome 1, chromosome 2, chromosome 3, chromosome 4, chromosome 5, chromosome 6, chromosome 7, chromosome 8, chromosome 10, chromosome 11, chromosome 12, chromosome 13, chromosome 14, chromosome 15, chromosome 16, and chromosome 17. Preferably, the normalizing sequence for chromosome 21 is selected from chromosome 9, chromosome 1, chromosome 2, chromosome 11, chromosome 12, and chromosome 14. Alternatively, the normalizing sequence for chromosome 21 is a group of chromosomes selected from chromosome 9, chromosome 1, chromosome 2, chromosome 3, chromosome 4, chromosome 5, chromosome 6, chromosome 7, chromosome 8, chromosome 10, chromosome 11, chromosome 12, chromosome 13, chromosome 14, chromosome 15, chromosome 16, and chromosome 17. Preferably, the group of chromosomes is a group selected from chromosome 9, chromosome 1, chromosome 2, chromosome 11, chromosome 12, and chromosome 14.
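
The two alternatives recited for step (d) may be illustrated by the following minimal sketch. It is offered as an aid to understanding and not as the implementation of the invention. The function names, the dictionary layout, and all numeric values (tag counts and chromosome lengths) are hypothetical and chosen only so that the arithmetic is easy to follow.

```python
from typing import Mapping


def count_ratio_dose(tags: Mapping[str, int], target: str, normalizer: str) -> float:
    """Chromosome dose as the ratio of mapped tag counts (target / normalizer)."""
    if tags[normalizer] == 0:
        raise ValueError("normalizing chromosome has no mapped tags")
    return tags[target] / tags[normalizer]


def density_ratio_dose(
    tags: Mapping[str, int],
    lengths: Mapping[str, float],
    target: str,
    normalizer: str,
) -> float:
    """Chromosome dose as the ratio of length-normalized tag densities."""
    # Step (i): tag density for the chromosome of interest.
    target_density = tags[target] / lengths[target]
    # Step (ii): tag density for the normalizing chromosome.
    normalizer_density = tags[normalizer] / lengths[normalizer]
    if normalizer_density == 0:
        raise ValueError("normalizing chromosome has no mapped tags")
    # Step (iii): dose is the ratio of the two densities.
    return target_density / normalizer_density


# Hypothetical values: counts and lengths are invented, not measured.
example_tags = {"chr21": 100_000, "chr9": 400_000}
example_lengths = {"chr21": 50.0, "chr9": 100.0}  # arbitrary length units

print(count_ratio_dose(example_tags, "chr21", "chr9"))
print(density_ratio_dose(example_tags, example_lengths, "chr21", "chr9"))
```

In this sketch the two alternatives differ only in whether each tag count is first divided by the length of its chromosome; the resulting dose is in either case a dimensionless ratio that is compared to a threshold in step (e).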

In one embodiment, the fetal and maternal nucleic acid molecules are cell-free DNA molecules. In some embodiments, the sequencing method for identifying the fetal trisomy 21 is a next generation sequencing method. In some embodiments, the sequencing method is a massively parallel sequencing method that uses sequencing-by-synthesis with reversible dye terminators. In other embodiments, the sequencing method is sequencing-by-ligation. In some embodiments, sequencing comprises an amplification. In other embodiments, sequencing is single molecule sequencing.

In another embodiment, the invention provides a method for identifying fetal trisomy 21 in a maternal blood sample e.g. a plasma sample comprising fetal and maternal nucleic acid molecules, and comprises the steps: (a) sequencing at least a portion of said nucleic acid molecules, thereby obtaining sequence information for a plurality of fetal and maternal nucleic acid molecules of a maternal plasma sample; (b) using the sequence information to identify a number of mapped sequence tags for chromosome 21; (c) using the sequence information to identify a number of mapped sequence tags for at least one normalizing chromosome; (d) using the number of mapped sequence tags identified for chromosome 21 in step (b) and the number of mapped sequence tags identified for the at least one normalizing chromosome in step (c) to calculate a chromosome dose for chromosome 21; and (e) comparing said chromosome dose to at least one threshold value, and thereby identifying the presence or absence of fetal trisomy 21. In one embodiment, step (d) comprises calculating a chromosome dose for chromosome 21 as the ratio of the number of mapped sequence tags identified for chromosome 21 and the number of mapped sequence tags identified for the at least one normalizing chromosome.
Alternatively, step (d) comprises (i) calculating a sequence tag density ratio for chromosome 21, by relating the number of mapped sequence tags identified for chromosome 21 in step (b) to the length of chromosome 21; (ii) calculating a sequence tag density ratio for said at least one normalizing chromosome, by relating the number of mapped sequence tags identified for said at least one normalizing chromosome in step (c) to the length of said at least one normalizing chromosome; and (iii) using the sequence tag density ratios calculated in steps (i) and (ii) to calculate a chromosome dose for chromosome 21, wherein the chromosome dose is calculated as the ratio of the sequence tag density ratio for chromosome 21 and the sequence tag density ratio for said at least one normalizing chromosome. The at least one normalizing chromosome is a chromosome having the smallest variability and/or the greatest differentiability. The at least one normalizing chromosome is selected from chromosome 9, chromosome 1, chromosome 2, chromosome 3, chromosome 4, chromosome 5, chromosome 6, chromosome 7, chromosome 8, chromosome 10, chromosome 11, chromosome 12, chromosome 13, chromosome 14, chromosome 15, chromosome 16, and chromosome 17. Preferably, the normalizing sequence for chromosome 21 is selected from chromosome 9, chromosome 1, chromosome 2, chromosome 11, chromosome 12, and chromosome 14. Alternatively, the normalizing sequence for chromosome 21 is a group of chromosomes selected from chromosome 9, chromosome 1, chromosome 2, chromosome 3, chromosome 4, chromosome 5, chromosome 6, chromosome 7, chromosome 8, chromosome 10, chromosome 11, chromosome 12, chromosome 13, chromosome 14, chromosome 15, chromosome 16, and chromosome 17. Preferably, the group of chromosomes is a group selected from chromosome 9, chromosome 1, chromosome 2, chromosome 11, chromosome 12, and chromosome 14.

In one embodiment, the fetal and maternal nucleic acid molecules are cell-free DNA molecules. In some embodiments, the maternal blood sample is a plasma sample. In some embodiments, the sequencing method for identifying the fetal trisomy 21 is a next generation sequencing method. In some embodiments, the sequencing method is a massively parallel sequencing method that uses sequencing-by-synthesis with reversible dye terminators. In other embodiments, the sequencing method is sequencing-by-ligation. In some embodiments, sequencing comprises an amplification. In other embodiments, sequencing is single molecule sequencing.

In one embodiment, the invention provides a method for identifying fetal trisomy 18, said method comprising the steps: (a) obtaining sequence information for a plurality of fetal and maternal nucleic acid molecules of a maternal blood sample e.g. a plasma sample; (b) using the sequence information to identify a number of mapped sequence tags for chromosome 18; (c) using the sequence information to identify a number of mapped sequence tags for at least one normalizing chromosome; (d) using the number of mapped sequence tags identified for chromosome 18 in step (b) and the number of mapped sequence tags identified for the at least one normalizing chromosome in step (c) to calculate a chromosome dose for chromosome 18; and (e) comparing said chromosome dose to at least one threshold value, and thereby identifying the presence or absence of fetal trisomy 18. In one embodiment, step (d) comprises calculating a chromosome dose for chromosome 18 as the ratio of the number of mapped sequence tags identified for chromosome 18 and the number of mapped sequence tags identified for the at least one normalizing chromosome. Alternatively, step (d) comprises (i) calculating a sequence tag density ratio for chromosome 18, by relating the number of mapped sequence tags identified for chromosome 18 in step (b) to the length of chromosome 18; (ii) calculating a sequence tag density ratio for said at least one normalizing chromosome, by relating the number of mapped sequence tags identified for said at least one normalizing chromosome in step (c) to the length of said at least one normalizing chromosome; and (iii) using the sequence tag density ratios calculated in steps (i) and (ii) to calculate a chromosome dose for chromosome 18, wherein the chromosome dose is calculated as the ratio of the sequence tag density ratio for chromosome 18 and the sequence tag density ratio for said at least one normalizing chromosome.
The at least one normalizing chromosome is a chromosome having the smallest variability and/or the greatest differentiability. The at least one normalizing chromosome is selected from chromosome 8, chromosome 2, chromosome 3, chromosome 4, chromosome 5, chromosome 6, chromosome 7, chromosome 9, chromosome 10, chromosome 11, chromosome 12, chromosome 13, and chromosome 14. Preferably, the normalizing sequence for chromosome 18 is selected from chromosome 8, chromosome 2, chromosome 3, chromosome 5, chromosome 6, chromosome 12, and chromosome 14. Alternatively, the normalizing sequence for chromosome 18 is a group of chromosomes selected from chromosome 8, chromosome 2, chromosome 3, chromosome 4, chromosome 5, chromosome 6, chromosome 7, chromosome 9, chromosome 10, chromosome 11, chromosome 12, chromosome 13, and chromosome 14. Preferably, the group of chromosomes is a group selected from chromosome 8, chromosome 2, chromosome 3, chromosome 5, chromosome 6, chromosome 12, and chromosome 14.
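
Step (e), in which the chromosome dose is compared to at least one threshold value, may be sketched as follows for a trisomy. This is an illustration under stated assumptions only: the use of two thresholds with an intermediate "no call" band, the threshold values, and the result labels are all hypothetical and are not taken from the description above, which requires only that at least one threshold be used.

```python
def compare_dose_to_thresholds(dose: float, lower: float, upper: float) -> str:
    """Classify a chromosome dose for a trisomy against two thresholds.

    Assumption (hypothetical): a dose above `upper` is read as trisomy
    present, a dose below `lower` as trisomy absent, and anything in
    between is left uncalled.
    """
    if lower > upper:
        raise ValueError("lower threshold must not exceed upper threshold")
    if dose > upper:
        return "affected"
    if dose < lower:
        return "unaffected"
    return "no call"


# Hypothetical threshold values, chosen only for illustration.
LOWER, UPPER = 0.25, 0.28

for dose in (0.24, 0.26, 0.30):
    print(dose, compare_dose_to_thresholds(dose, LOWER, UPPER))
```

Setting `lower` equal to `upper` reduces the sketch to a single-threshold comparison.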

In one embodiment, the fetal and maternal nucleic acid molecules are cell-free DNA molecules. In some embodiments, the maternal blood sample is a plasma sample. In some embodiments, the sequencing method for identifying the fetal trisomy 18 is a next generation sequencing method. In some embodiments, the sequencing method is a massively parallel sequencing method that uses sequencing-by-synthesis with reversible dye terminators. In other embodiments, the sequencing method is sequencing-by-ligation. In some embodiments, sequencing comprises an amplification. In other embodiments, sequencing is single molecule sequencing.

In another embodiment, the invention provides a method for identifying fetal trisomy 18 in a maternal blood sample e.g. a plasma sample comprising fetal and maternal nucleic acid molecules, and comprises the steps: (a) sequencing at least a portion of said nucleic acid molecules, thereby obtaining sequence information for a plurality of fetal and maternal nucleic acid molecules of a maternal plasma sample; (b) using the sequence information to identify a number of mapped sequence tags for chromosome 18; (c) using the sequence information to identify a number of mapped sequence tags for at least one normalizing chromosome; (d) using the number of mapped sequence tags identified for chromosome 18 in step (b) and the number of mapped sequence tags identified for the at least one normalizing chromosome in step (c) to calculate a chromosome dose for chromosome 18; and (e) comparing said chromosome dose to at least one threshold value, and thereby identifying the presence or absence of fetal trisomy 18. In one embodiment, step (d) comprises calculating a chromosome dose for chromosome 18 as the ratio of the number of mapped sequence tags identified for chromosome 18 and the number of mapped sequence tags identified for the at least one normalizing chromosome.
Alternatively, step (d) comprises (i) calculating a sequence tag density ratio for chromosome 18, by relating the number of mapped sequence tags identified for chromosome 18 in step (b) to the length of chromosome 18; (ii) calculating a sequence tag density ratio for said at least one normalizing chromosome, by relating the number of mapped sequence tags identified for said at least one normalizing chromosome in step (c) to the length of said at least one normalizing chromosome; and (iii) using the sequence tag density ratios calculated in steps (i) and (ii) to calculate a chromosome dose for chromosome 18, wherein the chromosome dose is calculated as the ratio of the sequence tag density ratio for chromosome 18 and the sequence tag density ratio for said at least one normalizing chromosome. The at least one normalizing chromosome is a chromosome having the smallest variability and/or the greatest differentiability. The at least one normalizing chromosome is selected from chromosome 8, chromosome 2, chromosome 3, chromosome 4, chromosome 5, chromosome 6, chromosome 7, chromosome 9, chromosome 10, chromosome 11, chromosome 12, chromosome 13, and chromosome 14. Preferably, the normalizing sequence for chromosome 18 is selected from chromosome 8, chromosome 2, chromosome 3, chromosome 5, chromosome 6, chromosome 12, and chromosome 14. Alternatively, the normalizing sequence for chromosome 18 is a group of chromosomes selected from chromosome 8, chromosome 2, chromosome 3, chromosome 4, chromosome 5, chromosome 6, chromosome 7, chromosome 9, chromosome 10, chromosome 11, chromosome 12, chromosome 13, and chromosome 14. Preferably, the group of chromosomes is a group selected from chromosome 8, chromosome 2, chromosome 3, chromosome 5, chromosome 6, chromosome 12, and chromosome 14.

In one embodiment, the fetal and maternal nucleic acid molecules are cell-free DNA molecules. In some embodiments, the maternal blood sample is a plasma sample. In some embodiments, the sequencing method for identifying the fetal trisomy 18 is a next generation sequencing method. In some embodiments, the sequencing method is a massively parallel sequencing method that uses sequencing-by-synthesis with reversible dye terminators. In other embodiments, the sequencing method is sequencing-by-ligation. In some embodiments, sequencing comprises an amplification. In other embodiments, sequencing is single molecule sequencing.

In one embodiment, the invention provides a method for identifying fetal trisomy 13, said method comprising the steps: (a) obtaining sequence information for a plurality of fetal and maternal nucleic acid molecules of a maternal blood sample e.g. a plasma sample; (b) using the sequence information to identify a number of mapped sequence tags for chromosome 13; (c) using the sequence information to identify a number of mapped sequence tags for at least one normalizing chromosome; (d) using the number of mapped sequence tags identified for chromosome 13 in step (b) and the number of mapped sequence tags identified for the at least one normalizing chromosome in step (c) to calculate a chromosome dose for chromosome 13; and (e) comparing said chromosome dose to at least one threshold value, and thereby identifying the presence or absence of fetal trisomy 13. In one embodiment, step (d) comprises calculating a chromosome dose for chromosome 13 as the ratio of the number of mapped sequence tags identified for chromosome 13 and the number of mapped sequence tags identified for the at least one normalizing chromosome. Alternatively, step (d) comprises (i) calculating a sequence tag density ratio for chromosome 13, by relating the number of mapped sequence tags identified for chromosome 13 in step (b) to the length of chromosome 13; (ii) calculating a sequence tag density ratio for said at least one normalizing chromosome, by relating the number of mapped sequence tags identified for said at least one normalizing chromosome in step (c) to the length of said at least one normalizing chromosome; and (iii) using the sequence tag density ratios calculated in steps (i) and (ii) to calculate a chromosome dose for chromosome 13, wherein the chromosome dose is calculated as the ratio of the sequence tag density ratio for chromosome 13 and the sequence tag density ratio for said at least one normalizing chromosome.
The at least one normalizing chromosome is a chromosome having the smallest variability and/or the greatest differentiability. The at least one normalizing chromosome is selected from chromosome 2, chromosome 3, chromosome 4, chromosome 5, chromosome 6, chromosome 7, chromosome 8, chromosome 9, chromosome 10, chromosome 11, chromosome 12, chromosome 14, chromosome 18, and chromosome 21. Preferably, the normalizing sequence for chromosome 13 is a chromosome selected from chromosome 2, chromosome 3, chromosome 4, chromosome 5, chromosome 6, and chromosome 8. In another embodiment, the normalizing sequence for chromosome 13 is a group of chromosomes selected from chromosome 2, chromosome 3, chromosome 4, chromosome 5, chromosome 6, chromosome 7, chromosome 8, chromosome 9, chromosome 10, chromosome 11, chromosome 12, chromosome 14, chromosome 18, and chromosome 21. Preferably, the group of chromosomes is a group selected from chromosome 2, chromosome 3, chromosome 4, chromosome 5, chromosome 6, and chromosome 8.
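
Where the normalizing sequence is a group of chromosomes, one way to form the dose is sketched below. The text above does not state how the tag counts of a group are combined; summing them is an assumption made for this illustration, and the function name and all tag counts are hypothetical.

```python
from typing import Iterable, Mapping


def group_normalized_dose(
    tags: Mapping[str, int], target: str, group: Iterable[str]
) -> float:
    """Dose of `target` relative to a group of normalizing chromosomes.

    Assumption (hypothetical): the group acts as a single normalizing
    sequence whose tag count is the sum of its members' tag counts.
    """
    group_total = sum(tags[chrom] for chrom in group)
    if group_total == 0:
        raise ValueError("normalizing group has no mapped tags")
    return tags[target] / group_total


# Hypothetical tag counts for chromosome 13 and the preferred group above.
example_tags = {
    "chr13": 300,
    "chr2": 250, "chr3": 200, "chr4": 190,
    "chr5": 180, "chr6": 170, "chr8": 210,
}
preferred_group = ["chr2", "chr3", "chr4", "chr5", "chr6", "chr8"]

print(group_normalized_dose(example_tags, "chr13", preferred_group))
```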

In one embodiment, the fetal and maternal nucleic acid molecules are cell-free DNA molecules. In some embodiments, the maternal blood sample is a plasma sample. In some embodiments, the sequencing method for identifying the fetal trisomy 13 is a next generation sequencing method. In some embodiments, the sequencing method is a massively parallel sequencing method that uses sequencing-by-synthesis with reversible dye terminators. In other embodiments, the sequencing method is sequencing-by-ligation. In some embodiments, sequencing comprises an amplification. In other embodiments, sequencing is single molecule sequencing.

In another embodiment, the invention provides a method for identifying fetal trisomy 13 in a maternal blood sample e.g. a plasma sample comprising fetal and maternal nucleic acid molecules, and comprises the steps: (a) sequencing at least a portion of said nucleic acid molecules, thereby obtaining sequence information for a plurality of fetal and maternal nucleic acid molecules of a maternal plasma sample; (b) using the sequence information to identify a number of mapped sequence tags for chromosome 13; (c) using the sequence information to identify a number of mapped sequence tags for at least one normalizing chromosome; (d) using the number of mapped sequence tags identified for chromosome 13 in step (b) and the number of mapped sequence tags identified for the at least one normalizing chromosome in step (c) to calculate a chromosome dose for chromosome 13; and (e) comparing said chromosome dose to at least one threshold value, and thereby identifying the presence or absence of fetal trisomy 13. In one embodiment, step (d) comprises calculating a chromosome dose for chromosome 13 as the ratio of the number of mapped sequence tags identified for chromosome 13 and the number of mapped sequence tags identified for the at least one normalizing chromosome.
Alternatively, step (d) comprises (i) calculating a sequence tag density ratio for chromosome 13, by relating the number of mapped sequence tags identified for chromosome 13 in step (b) to the length of chromosome 13; (ii) calculating a sequence tag density ratio for said at least one normalizing chromosome, by relating the number of mapped sequence tags identified for said at least one normalizing chromosome in step (c) to the length of said at least one normalizing chromosome; and (iii) using the sequence tag density ratios calculated in steps (i) and (ii) to calculate a chromosome dose for chromosome 13, wherein the chromosome dose is calculated as the ratio of the sequence tag density ratio for chromosome 13 and the sequence tag density ratio for said at least one normalizing chromosome. The at least one normalizing chromosome is a chromosome having the smallest variability and/or the greatest differentiability. The at least one normalizing chromosome is selected from chromosome 2, chromosome 3, chromosome 4, chromosome 5, chromosome 6, chromosome 7, chromosome 8, chromosome 9, chromosome 10, chromosome 11, chromosome 12, chromosome 14, chromosome 18, and chromosome 21. Preferably, the normalizing sequence for chromosome 13 is a chromosome selected from chromosome 2, chromosome 3, chromosome 4, chromosome 5, chromosome 6, and chromosome 8. In another embodiment, the normalizing sequence for chromosome 13 is a group of chromosomes selected from chromosome 2, chromosome 3, chromosome 4, chromosome 5, chromosome 6, chromosome 7, chromosome 8, chromosome 9, chromosome 10, chromosome 11, chromosome 12, chromosome 14, chromosome 18, and chromosome 21. Preferably, the group of chromosomes is a group selected from chromosome 2, chromosome 3, chromosome 4, chromosome 5, chromosome 6, and chromosome 8.

In one embodiment, the fetal and maternal nucleic acid molecules are cell-free DNA molecules. In some embodiments, the maternal blood sample is a plasma sample. In some embodiments, the sequencing method for identifying the fetal trisomy 13 is a next generation sequencing method. In some embodiments, the sequencing method is a massively parallel sequencing method that uses sequencing-by-synthesis with reversible dye terminators. In other embodiments, the sequencing method is sequencing-by-ligation. In some embodiments, sequencing comprises an amplification. In other embodiments, sequencing is single molecule sequencing.

In one embodiment, the invention provides a method for identifying fetal monosomy X, said method comprising the steps: (a) obtaining sequence information for a plurality of fetal and maternal nucleic acid molecules of a maternal blood sample e.g. a plasma sample; (b) using the sequence information to identify a number of mapped sequence tags for chromosome X; (c) using the sequence information to identify a number of mapped sequence tags for at least one normalizing chromosome; (d) using the number of mapped sequence tags identified for chromosome X in step (b) and the number of mapped sequence tags identified for the at least one normalizing chromosome in step (c) to calculate a chromosome dose for chromosome X; and (e) comparing said chromosome dose to at least one threshold value, and thereby identifying the presence or absence of fetal monosomy X. In one embodiment, step (d) comprises calculating a chromosome dose for chromosome X as the ratio of the number of mapped sequence tags identified for chromosome X and the number of mapped sequence tags identified for the at least one normalizing chromosome. Alternatively, step (d) comprises (i) calculating a sequence tag density ratio for chromosome X, by relating the number of mapped sequence tags identified for chromosome X in step (b) to the length of chromosome X; (ii) calculating a sequence tag density ratio for said at least one normalizing chromosome, by relating the number of mapped sequence tags identified for said at least one normalizing chromosome in step (c) to the length of said at least one normalizing chromosome; and (iii) using the sequence tag density ratios calculated in steps (i) and (ii) to calculate a chromosome dose for chromosome X, wherein the chromosome dose is calculated as the ratio of the sequence tag density ratio for chromosome X and the sequence tag density ratio for said at least one normalizing chromosome.
The at least one normalizing chromosome is a chromosome having the smallest variability and/or the greatest differentiability. The at least one normalizing chromosome is selected from chromosome 1, chromosome 2, chromosome 3, chromosome 4, chromosome 5, chromosome 6, chromosome 7, chromosome 8, chromosome 9, chromosome 10, chromosome 11, chromosome 12, chromosome 13, chromosome 14, chromosome 15, and chromosome 16. Preferably, the normalizing sequence for chromosome X is selected from chromosome 2, chromosome 3, chromosome 4, chromosome 5, chromosome 6, and chromosome 8. Alternatively, the normalizing sequence for chromosome X is a group of chromosomes selected from chromosome 1, chromosome 2, chromosome 3, chromosome 4, chromosome 5, chromosome 6, chromosome 7, chromosome 8, chromosome 9, chromosome 10, chromosome 11, chromosome 12, chromosome 13, chromosome 14, chromosome 15, and chromosome 16. Preferably, the group of chromosomes is a group selected from chromosome 2, chromosome 3, chromosome 4, chromosome 5, chromosome 6, and chromosome 8.

In one embodiment, the fetal and maternal nucleic acid molecules are cell-free DNA molecules. In some embodiments, the maternal blood sample is a plasma sample. In some embodiments, the sequencing method for identifying the fetal monosomy X is a next generation sequencing method. In some embodiments, the sequencing method is a massively parallel sequencing method that uses sequencing-by-synthesis with reversible dye terminators. In other embodiments, the sequencing method is sequencing-by-ligation. In some embodiments, sequencing comprises an amplification. In other embodiments, sequencing is single molecule sequencing.

In another embodiment, the invention provides a method for identifying fetal monosomy X in a maternal blood sample e.g. a plasma sample comprising fetal and maternal nucleic acid molecules, and comprises the steps: (a) sequencing at least a portion of said nucleic acid molecules, thereby obtaining sequence information for a plurality of fetal and maternal nucleic acid molecules of a maternal plasma sample; (b) using the sequence information to identify a number of mapped sequence tags for chromosome X; (c) using the sequence information to identify a number of mapped sequence tags for at least one normalizing chromosome; (d) using the number of mapped sequence tags identified for chromosome X in step (b) and the number of mapped sequence tags identified for the at least one normalizing chromosome in step (c) to calculate a chromosome dose for chromosome X; and (e) comparing said chromosome dose to at least one threshold value, and thereby identifying the presence or absence of fetal monosomy X. In one embodiment, step (d) comprises calculating a chromosome dose for chromosome X as the ratio of the number of mapped sequence tags identified for chromosome X and the number of mapped sequence tags identified for the at least one normalizing chromosome.
Alternatively, step (d) comprises (i) calculating a sequence tag density ratio for chromosome X, by relating the number of mapped sequence tags identified for chromosome X in step (b) to the length of chromosome X; (ii) calculating a sequence tag density ratio for said at least one normalizing chromosome, by relating the number of mapped sequence tags identified for said at least one normalizing chromosome in step (c) to the length of said at least one normalizing chromosome; and (iii) using the sequence tag density ratios calculated in steps (i) and (ii) to calculate a chromosome dose for chromosome X, wherein the chromosome dose is calculated as the ratio of the sequence tag density ratio for chromosome X and the sequence tag density ratio for said at least one normalizing chromosome. The at least one normalizing chromosome is a chromosome having the smallest variability and/or the greatest differentiability. The at least one normalizing chromosome is selected from chromosome 1, chromosome 2, chromosome 3, chromosome 4, chromosome 5, chromosome 6, chromosome 7, chromosome 8, chromosome 9, chromosome 10, chromosome 11, chromosome 12, chromosome 13, chromosome 14, chromosome 15, and chromosome 16. Preferably, the normalizing sequence for chromosome X is selected from chromosome 2, chromosome 3, chromosome 4, chromosome 5, chromosome 6, and chromosome 8. Alternatively, the normalizing sequence for chromosome X is a group of chromosomes selected from chromosome 1, chromosome 2, chromosome 3, chromosome 4, chromosome 5, chromosome 6, chromosome 7, chromosome 8, chromosome 9, chromosome 10, chromosome 11, chromosome 12, chromosome 13, chromosome 14, chromosome 15, and chromosome 16. Preferably, the group of chromosomes is a group selected from chromosome 2, chromosome 3, chromosome 4, chromosome 5, chromosome 6, and chromosome 8.

In one embodiment, the fetal and maternal nucleic acid molecules are cell-free DNA molecules. In some embodiments, the maternal blood sample is a plasma sample. In some embodiments, the sequencing method for identifying the fetal monosomy X is a next generation sequencing method. In some embodiments, the sequencing method is a massively parallel sequencing method that uses sequencing-by-synthesis with reversible dye terminators. In other embodiments, the sequencing method is sequencing-by-ligation. In some embodiments, sequencing comprises an amplification. In other embodiments, sequencing is single molecule sequencing.

In another embodiment, the invention provides a method for identifyingcopy number variation (CNV) of a sequence of interest e.g. a clinicallyrelevant sequence, in a test sample comprising the steps of: (a)obtaining a test sample and a plurality of qualified samples, said testsample comprising test nucleic acid molecules and said plurality ofqualified samples comprising qualified nucleic acid molecules; (b)sequencing at least a portion of said qualified and test nucleic acidmolecules, wherein said sequencing comprises providing a plurality ofmapped sequence tags for a test and a qualified sequence of interest,and for at least one test and at least one qualified normalizingsequence; (c) based on said sequencing of said qualified nucleic acidmolecules, calculating a qualified sequence dose for said qualifiedsequence of interest in each of said plurality of qualified samples,wherein said calculating a qualified sequence dose comprises determininga parameter for said qualified sequence of interest and at least onequalified normalizing sequence; (d) based on said qualified sequencedose, identifying at least one qualified normalizing sequence, whereinsaid at least one qualified normalizing sequence has the smallestvariability and/or the greatest differentiability in sequence dose insaid plurality of qualified samples; (e) based on said sequencing ofsaid nucleic acid molecules in said test sample, calculating a testsequence dose for said test sequence of interest, wherein saidcalculating a test sequence dose comprises determining a parameter forsaid test sequence of interest and at least one normalizing testsequence, and wherein said at least one normalizing test sequencecorresponds to said at least one qualified normalizing sequence; (f)comparing said test sequence dose to at least one threshold value; and(g) assessing said copy number variation of said sequence of interest insaid test sample based on the outcome of step (f). 
In one embodiment,the parameter for said qualified sequence of interest and at least onequalified normalizing sequence relates the number of sequence tagsmapped to said qualified sequence of interest to the number of tagsmapped to said qualified normalizing sequence, and wherein saidparameter for said test sequence of interest and at least onenormalizing test sequence relates the number of sequence tags mapped tosaid test sequence of interest to the number of tags mapped to saidnormalizing test sequence. In some embodiments, the sequencing step isperformed using next generation sequencing method. In some embodiments,the sequencing method is a massively parallel sequencing method thatuses sequencing-by-synthesis with reversible dye terminators. In otherembodiments, the sequencing method is sequencing-by-ligation. In someembodiments, sequencing comprises an amplification. In otherembodiments, sequencing is single molecule sequencing. The CNV of asequence of interest is an aneuploidy, which can be a chromosomal or apartial aneuploidy. In some embodiments, the chromosomal aneuploidy isselected from trisomy 8, trisomy 13, trisomy 15, trisomy 16, trisomy 18,trisomy 21, trisomy 22, monosomy X, and XXX. In other embodiments, thepartial aneuploidy is a partial chromosomal deletion or a partialchromosomal insertion. In some embodiments, the CNV identified by themethod is a chromosomal or partial aneuploidy associated with cancer. Insome embodiments, the test and qualified sample are biological fluidsamples e.g. plasma samples, obtained from a pregnant subject such as apregnant human subject. In other embodiments, a test and qualifiedbiological fluid samples e.g. plasma samples, are obtained from asubject that is known or is suspected of having cancer.
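Step (d) of the method above, identifying the qualified normalizing sequence with the smallest variability in sequence dose across the qualified samples, could be sketched as below. The sketch uses the coefficient of variation (CV) as the variability measure, in line with FIGS. 7 and 8; it does not model differentiability, and the data layout (one dictionary of tag counts per qualified sample) and all names are assumptions for illustration.

```python
from statistics import mean, stdev


def coefficient_of_variation(values):
    """Sample standard deviation divided by the mean."""
    return stdev(values) / mean(values)


def pick_normalizing_sequence(qualified_samples, sequence_of_interest):
    """Return (name, cv) of the candidate giving the least variable dose.

    qualified_samples: list of dicts mapping sequence name -> mapped tag count,
    one dict per qualified sample (hypothetical layout).
    """
    best = None
    for candidate in qualified_samples[0]:
        if candidate == sequence_of_interest:
            continue
        # Qualified sequence dose in each sample for this candidate.
        doses = [s[sequence_of_interest] / s[candidate] for s in qualified_samples]
        cv = coefficient_of_variation(doses)
        if best is None or cv < best[1]:
            best = (candidate, cv)
    return best
```

The same candidate would then be used as the normalizing test sequence when the test sequence dose is calculated in step (e).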

In another embodiment, the invention provides a method for identifying afetal chromosomal aneuploidy in a test sample, said method comprising:(a) obtaining a test sample comprising a test nucleic acid molecules anda plurality of qualified samples comprising qualified nucleic acidmolecules; (b) sequencing at least a portion of said qualified and testnucleic acid molecules, wherein said sequencing comprises providing aplurality of mapped sequence tags for a test and a qualified chromosomeof interest, and for at least one test and at least one qualifiednormalizing chromosome; (c) based on said sequencing of said qualifiedchromosomes, calculating a qualified chromosome dose for said qualifiedchromosome of interest in each of said plurality of qualified samples,wherein said calculating a qualified chromosome dose comprisesdetermining a parameter for said qualified chromosome of interest and atleast one qualified normalizing chromosome; (d) based on said qualifiedchromosome dose, identifying at least one qualified normalizingchromosome, wherein said at least one qualified normalizing chromosomehas the smallest variability and/or the greatest differentiability inchromosome dose in said plurality of qualified samples; (e) based onsaid sequencing of said nucleic acid molecules in said test sample,calculating a test chromosome dose for said test chromosome of interest,wherein said calculating a test chromosome dose comprises determining aparameter for said test chromosome of interest and at least onenormalizing test chromosome, and wherein said at least one normalizingtest chromosome corresponds to said at least one qualified normalizingchromosome; (f) comparing said test chromosome dose to at least onethreshold value; and (g) determining said chromosomal aneuploidy basedon the outcome of step (f). 
The parameter for said qualified chromosomeof interest and at least one qualified normalizing chromosome relatesthe number of sequence tags mapped to said qualified chromosome ofinterest to the number of tags mapped to said normalizing chromosomesequence, and wherein said parameter for said test chromosome ofinterest and at least one normalizing test chromosome relates the numberof sequence tags mapped to said test chromosome of interest to thenumber of tags mapped to said normalizing chromosome sequence.Chromosomes of interest include but are not limited to chromosome 8,chromosome 13, chromosome 15, chromosome 16, chromosome 18, chromosome21, chromosome 22, and chromosome X. Chromosomal aneuploidies that canbe identified using the method include but are not limited to fromtrisomy 8, trisomy 13, trisomy 15, trisomy 16, trisomy 18, trisomy 21,trisomy 22, monosomy X, and XXX.

In one embodiment, the test and qualified samples are substantially cell-free biological samples. Biological samples are maternal samples selected from maternal blood, plasma, serum, urine and saliva. In one embodiment, the biological samples are maternal plasma samples. The maternal samples comprise fetal and maternal nucleic acid molecules e.g. cell-free DNA. Sequencing of the fetal and maternal nucleic acid molecules can be performed by next generation sequencing methods. In some embodiments, the sequencing method is a massively parallel sequencing method that uses sequencing-by-synthesis with reversible dye terminators. In other embodiments, the sequencing method is sequencing-by-ligation. In some embodiments, sequencing comprises an amplification. In other embodiments, sequencing is single molecule sequencing.

Although the examples herein concern humans and the language is primarily directed to human concerns, the concept of this invention is applicable to genomes from any plant or animal.

4. INCORPORATION BY REFERENCE

All patents, patent applications, and other publications, including allsequences disclosed within these references, referred to herein areexpressly incorporated by reference, to the same extent as if eachindividual publication, patent or patent application was specificallyand individually indicated to be incorporated by reference. Alldocuments cited are, in relevant part, incorporated herein by reference.However, the citation of any document is not to be construed as anadmission that it is prior art with respect to the present invention.

5. BRIEF DESCRIPTION OF THE DRAWINGS

The novel features of the invention are set forth with particularity inthe appended claims. A better understanding of the features andadvantages of the present invention will be obtained by reference to thefollowing detailed description that sets forth illustrative embodiments,in which the principles of the invention are utilized, and theaccompanying drawings of which:

FIG. 1 is a flowchart of a method 100 for determining the presence or absence of a copy number variation in a test sample comprising a mixture of nucleic acids.

FIG. 2 illustrates the distribution of the chromosome dose for chromosome 21 determined from sequencing cfDNA extracted from a set of 48 blood samples obtained from human subjects each pregnant with a male or a female fetus. Chromosome 21 doses for qualified i.e. normal for chromosome 21 (O), and trisomy 21 (Δ) test samples are shown for chromosomes 1-12 and X (FIG. 2A), and for chromosomes 1-22 and X (FIG. 2B).

FIG. 3 illustrates the distribution of the chromosome dose forchromosome 18 determined from sequencing cfDNA extracted from a set of48 blood samples obtained from human subjects each pregnant with a maleor a female fetus. Chromosome 18 doses for qualified i.e. normal forchromosome 18 (O), and trisomy 18 (Δ) test samples are shown forchromosomes 1-12 and X (FIG. 3A), and for chromosomes 1-22 and X (FIG.3B).

FIG. 4 illustrates the distribution of the chromosome dose forchromosome 13 determined from sequencing cfDNA extracted from a set of48 blood samples obtained from human subjects each pregnant with a maleor a female fetus. Chromosome 13 doses for qualified i.e. normal forchromosome 13 (O), and trisomy 13 (Δ) test samples are shown forchromosomes 1-12 and X (FIG. 4A), and for chromosomes 1-22 and X (FIG.4B).

FIG. 5 illustrates the distribution of the chromosome doses forchromosome X determined from sequencing cfDNA extracted from a set of 48test blood samples obtained from human subjects each pregnant with amale or a female fetus. Chromosome X doses for males (46,XY; (O)),females (46,XX; (Δ)); monosomy X (45,X; (+)), and complex karyotypes(Cplx (X)) samples are shown for chromosomes 1-12 and X (FIG. 5A), andfor chromosomes 1-22 and X (FIG. 5B).

FIG. 6 illustrates the distribution of the chromosome doses forchromosome Y determined from sequencing cfDNA extracted from a set of 48test blood samples obtained from human subjects each pregnant with amale or a female fetus. Chromosome Y doses for males (46,XY; (Δ)),females (46,XX; (O)); monosomy X (45,X; (+)), and complex karyotypes(Cplx (X)) samples are shown for chromosomes 1-12 (FIG. 6A), and forchromosomes 1-22 (FIG. 6B).

FIG. 7 shows the coefficient of variation (CV) for chromosomes 21 (▪),18 () and 13 (▴) that was determined from the doses shown in FIGS. 2,3, and 4, respectively.

FIG. 8 shows the coefficient of variation (CV) for chromosomes X (▪) andY () that was determined from the doses shown in FIGS. 5 and 6,respectively.

FIG. 9 shows the cumulative distribution of GC fraction by humanchromosome. The vertical axis represents the frequency of the chromosomewith GC content below the value shown on the horizontal axis.

FIG. 10 illustrates the sequence doses (Y-axis) for a segment ofchromosome 11 (81000082-103000103 bp) determined from sequencing cfDNAextracted from a set of 7 qualified samples (O) obtained and 1 testsample (♦) from pregnant human subjects. A sample from a subjectcarrying a fetus with a partial aneuploidy of chromosome 11 (♦) wasidentified.

FIG. 11 illustrates the distribution of normalized chromosome doses forchromosome 21 (A), chromosome 18 (B), chromosome 13 (C), chromosome X(D) and chromosome Y (E) relative to the standard deviation of the mean(Y-axis) for the corresponding chromosomes in the unaffected samples.

6. DETAILED DESCRIPTION OF THE INVENTION

The invention provides a method for determining copy number variations(CNV) of a sequence of interest in a test sample that comprises amixture of nucleic acids that are known or are suspected to differ inthe amount of one or more sequence of interest. Sequences of interestinclude genomic sequences ranging from hundreds of bases to tens ofmegabases to entire chromosomes that are known or are suspected to beassociated with a genetic or a disease condition. Examples of sequencesof interest include chromosomes associated with well known aneuploidiese.g. trisomy 21, and segments of chromosomes that are multiplied indiseases such as cancer e.g. partial trisomy 8 in acute myeloidleukemia. The method comprises a statistical approach that accounts foraccrued variability stemming from process-related, interchromosomal, andinter-sequencing variability. The method is applicable to determiningCNV of any fetal aneuploidy, and CNVs known or suspected to beassociated with a variety of medical conditions.

Unless otherwise indicated, the practice of the present inventioninvolves conventional techniques commonly used in molecular biology,microbiology, protein purification, protein engineering, protein and DNAsequencing, and recombinant DNA fields, which are within the skill ofthe art. Such techniques are known to those of skill in the art and aredescribed in numerous standard texts and reference works. All patents,patent applications, articles and publications mentioned herein arehereby expressly incorporated herein by reference in their entirety.

Numeric ranges are inclusive of the numbers defining the range. It isintended that every maximum numerical limitation given throughout thisspecification includes every lower numerical limitation, as if suchlower numerical limitations were expressly written herein. Every minimumnumerical limitation given throughout this specification will includeevery higher numerical limitation, as if such higher numericallimitations were expressly written herein. Every numerical range giventhroughout this specification will include every narrower numericalrange that falls within such broader numerical range, as if suchnarrower numerical ranges were all expressly written herein.

The headings provided herein are not limitations of the various aspectsor embodiments of the invention which can be had by reference to theSpecification as a whole. Accordingly, as indicated above, the termsdefined immediately below are more fully defined by reference to thespecification as a whole.

Unless defined otherwise herein, all technical and scientific terms usedherein have the same meaning as commonly understood by one of ordinaryskill in the art to which this invention belongs. Various scientificdictionaries that include the terms included herein are well known andavailable to those in the art. Although any methods and materialssimilar or equivalent to those described herein find use in the practiceor testing of the present invention, some preferred methods andmaterials are described. Accordingly, the terms defined immediatelybelow are more fully described by reference to the Specification as awhole. It is to be understood that this invention is not limited to theparticular methodology, protocols, and reagents described, as these mayvary, depending upon the context they are used by those of skill in theart.

6.1 DEFINITIONS

As used herein, the singular terms “a”, “an,” and “the” include theplural reference unless the context clearly indicates otherwise. Unlessotherwise indicated, nucleic acids are written left to right in 5′ to 3′orientation and amino acid sequences are written left to right in aminoto carboxy orientation, respectively.

The term “assessing” herein refers to characterizing the status of a chromosomal aneuploidy by one of three types of calls: “normal”, “affected”, and “no-call”. For example, in the case of a trisomy, the “normal” call is determined by the value of a parameter e.g. a test chromosome dose, that is below a user-defined threshold of reliability, the “affected” call is determined by a parameter e.g. a test chromosome dose, that is above a user-defined threshold of reliability, and the “no-call” result is determined by a parameter e.g. a test chromosome dose, that lies between the user-defined thresholds of reliability for making a “normal” or an “affected” call.
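The three-way call defined above for a trisomy can be sketched as a comparison of the test chromosome dose against two user-defined thresholds. The function and parameter names, and the numeric thresholds in the usage note, are hypothetical.

```python
def assess(test_dose, normal_below, affected_above):
    """Classify a test chromosome dose as 'normal', 'affected', or 'no-call'.

    normal_below and affected_above are the user-defined thresholds of
    reliability; doses between them (inclusive) yield a no-call.
    """
    if normal_below > affected_above:
        raise ValueError("normal_below must not exceed affected_above")
    if test_dose < normal_below:
        return "normal"
    if test_dose > affected_above:
        return "affected"
    return "no-call"
```

With illustrative thresholds of 2.5 and 4.0, `assess(1.0, 2.5, 4.0)` returns `"normal"`, `assess(5.0, 2.5, 4.0)` returns `"affected"`, and `assess(3.0, 2.5, 4.0)` returns `"no-call"`.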

The term “copy number variation” herein refers to variation in the number of copies of a nucleic acid sequence that is 1 kb or larger present in a test sample in comparison with the copy number of the nucleic acid sequence present in a qualified sample. A “copy number variant” refers to the 1 kb or larger sequence of nucleic acid in which copy-number differences are found by comparison of a sequence of interest in a test sample with that present in a qualified sample. Copy number variants/variations include deletions, including microdeletions, insertions, including microinsertions, duplications, multiplications, inversions, translocations and complex multi-site variants. CNVs encompass chromosomal aneuploidies and partial aneuploidies.

The term “aneuploidy” herein refers to an imbalance of genetic material caused by a loss or gain of a whole chromosome, or part of a chromosome.

The term “chromosomal aneuploidy” herein refers to an imbalance of genetic material caused by a loss or gain of a whole chromosome, and includes germline aneuploidy and mosaic aneuploidy.

The term “partial aneuploidy” herein refers to an imbalance of genetic material caused by a loss or gain of part of a chromosome e.g. partial monosomy and partial trisomy, and encompasses imbalances resulting from translocations, deletions and insertions.

The term “plurality” is used herein in reference to a number of nucleic acid molecules or sequence tags that is sufficient to identify significant differences in copy number variations (e.g. chromosome doses) in test samples and qualified samples using the methods of the invention. In some embodiments, at least about 3×10⁶ sequence tags, at least about 5×10⁶ sequence tags, at least about 8×10⁶ sequence tags, at least about 10×10⁶ sequence tags, at least about 15×10⁶ sequence tags, at least about 20×10⁶ sequence tags, at least about 30×10⁶ sequence tags, at least about 40×10⁶ sequence tags, or at least about 50×10⁶ sequence tags comprising between 20 and 40 bp reads are obtained for each test sample.

The terms “polynucleotide”, “nucleic acid” and “nucleic acid molecules” are used interchangeably and refer to a covalently linked sequence of nucleotides (i.e., ribonucleotides for RNA and deoxyribonucleotides for DNA) in which the 3′ position of the pentose of one nucleotide is joined by a phosphodiester group to the 5′ position of the pentose of the next, and include sequences of any form of nucleic acid, including, but not limited to RNA, DNA and cfDNA molecules. The term “polynucleotide” includes, without limitation, single- and double-stranded polynucleotides.

The term “portion” is used herein in reference to the amount of sequence information of fetal and maternal nucleic acid molecules in a biological sample that in sum amounts to less than the sequence information of one human genome.

The term “test sample” herein refers to a sample comprising a mixture ofnucleic acids comprising at least one nucleic acid sequence whose copynumber is suspected of having undergone variation. Nucleic acids presentin a test sample are referred to as “test nucleic acids”.

The term “qualified sample” herein refers to a sample comprising amixture of nucleic acids that are present in a known copy number towhich the nucleic acids in a test sample are compared, and it is asample that is normal i.e. not aneuploid, for the sequence of intereste.g. a qualified sample used for identifying a normalizing chromosomefor chromosome 21 is a sample that is not a trisomy 21 sample.

The term “qualified nucleic acid” is used interchangeably with “qualified sequence” and refers to a sequence against which the amount of a test sequence or test nucleic acid is compared. A qualified sequence is one present in a biological sample preferably at a known representation i.e. the amount of a qualified sequence is known. A “qualified sequence of interest” is a qualified sequence for which the amount is known in a qualified sample, and is a sequence that is associated with a difference in sequence representation in an individual with a medical condition.

The term “sequence of interest” herein refers to a nucleic acid sequencethat is associated with a difference in sequence representation inhealthy versus diseased individuals. A sequence of interest can be asequence on a chromosome that is misrepresented i.e. over- orunder-represented, in a disease or genetic condition. A sequence ofinterest may also be a portion of a chromosome, or a chromosome. Forexample, a sequence of interest can be a chromosome that isover-represented in an aneuploidy condition, or a gene encoding atumor-suppressor that is under-represented in a cancer. Sequences ofinterest include sequences that are over- or under-represented in thetotal population, or a subpopulation of cells of a subject. A “qualifiedsequence of interest” is a sequence of interest in a qualified sample. A“test sequence of interest” is a sequence of interest in a test sample.

The term “normalizing sequence” herein refers to a sequence thatdisplays a variability in the number of sequence tags that are mapped toit among samples and sequencing runs that best approximates that of thesequence of interest for which it is used as a normalizing parameter,and that can best differentiate an affected sample from one or moreunaffected samples. A “normalizing chromosome” is an example of a“normalizing sequence”.

The term “differentiability” herein refers to the characteristic of a normalizing chromosome that enables one or more unaffected i.e. normal, samples to be distinguished from one or more affected i.e. aneuploid, samples.

The term “sequence dose” herein refers to a parameter that relates thesequence tag density of a sequence of interest to the tag density of anormalizing sequence. A “test sequence dose” is a parameter that relatesthe sequence tag density of a sequence of interest e.g. chromosome 21,to that of a normalizing sequence e.g. chromosome 9, determined in atest sample. Similarly, a “qualified sequence dose” is a parameter thatrelates the sequence tag density of a sequence of interest to that of anormalizing sequence determined in a qualified sample.

The term “sequence tag density” herein refers to the number of sequence reads that are mapped to a reference genome sequence e.g. the sequence tag density for chromosome 21 is the number of sequence reads generated by the sequencing method that are mapped to chromosome 21 of the reference genome. The term “sequence tag density ratio” herein refers to the ratio of the number of sequence tags that are mapped to a chromosome of the reference genome e.g. chromosome 21, to the length of that chromosome in the reference genome.

The term “parameter” herein refers to a numerical value thatcharacterizes a quantitative data set and/or a numerical relationshipbetween quantitative data sets. For example, a ratio (or function of aratio) between the number of sequence tags mapped to a chromosome andthe length of the chromosome to which the tags are mapped, is aparameter.

The terms “threshold value” and “qualified threshold value” herein referto any number that is calculated using a qualifying data set and servesas a limit of diagnosis of a copy number variation e.g. an aneuploidy,in an organism. If a threshold is exceeded by results obtained frompracticing the invention, a subject can be diagnosed with a copy numbervariation e.g. trisomy 21.
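One way a dose could be related to a qualifying data set before comparison with a threshold is to express it in units of the standard deviation of the doses observed in unaffected samples, as in the normalized chromosome doses of FIG. 11. The sketch below assumes this normalization; the function name and the example values are hypothetical, and no particular cut-off is implied by the specification at this point.

```python
from statistics import mean, stdev


def normalized_dose(test_dose, qualified_doses):
    """Distance of a test dose from the qualified mean, in standard deviations."""
    mu = mean(qualified_doses)
    sd = stdev(qualified_doses)  # sample standard deviation of qualified doses
    if sd == 0:
        raise ValueError("qualified doses show no variability")
    return (test_dose - mu) / sd
```

For qualified doses of 0.48, 0.50 and 0.52, a test dose of 0.56 lies about three standard deviations above the qualified mean; whether that exceeds the threshold depends on the threshold value chosen from the qualifying data set.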

The term “read” refers to a DNA sequence of sufficient length (e.g., at least about 30 bp) that can be used to identify a larger sequence or region, e.g. that can be aligned and specifically assigned to a chromosome or genomic region or gene.

The term “sequence tag” is herein used interchangeably with the term “mapped sequence tag” to refer to a sequence read that has been specifically assigned i.e. mapped, to a larger sequence e.g. a reference genome, by alignment. Mapped sequence tags are uniquely mapped to a reference genome i.e. they are assigned to a single location in the reference genome. Tags that can be mapped to more than one location on a reference genome i.e. tags that do not map uniquely, are not included in the analysis.
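The rule that only uniquely mapped tags enter the analysis can be sketched as a simple filter applied before counting tags per chromosome. The input layout (one list of candidate `(chromosome, position)` locations per read) is an assumption for illustration and does not reflect the output format of any particular aligner.

```python
from collections import Counter


def count_unique_tags(read_alignments):
    """Count mapped sequence tags per chromosome, keeping unique mappers only.

    read_alignments: iterable with one entry per read, each entry being a list
    of (chromosome, position) candidate locations (hypothetical layout).
    """
    counts = Counter()
    for locations in read_alignments:
        # Reads with no location are unmapped; reads with more than one
        # location do not map uniquely. Both are excluded from the analysis.
        if len(locations) == 1:
            chromosome, _position = locations[0]
            counts[chromosome] += 1
    return counts
```

The resulting per-chromosome counts are the mapped sequence tag numbers from which tag densities and doses are derived.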

As used herein, the terms “aligned”, “alignment”, or “aligning” refer toone or more sequences that are identified as a match in terms of theorder of their nucleic acid molecules to a known sequence from areference genome. Such alignment can be done manually or by a computeralgorithm, examples including the Efficient Local Alignment ofNucleotide Data (ELAND) computer program distributed as part of theIllumina Genomics Analysis pipeline. The matching of a sequence read inaligning can be a 100% sequence match or less than 100% (non-perfectmatch).

As used herein, the term “reference genome” refers to any particularknown genome sequence, whether partial or complete, of any organism orvirus which may be used to reference identified sequences from asubject. For example, a reference genome used for human subjects as wellas many other organisms is found at the National Center forBiotechnology Information at www.ncbi.nlm.nih.gov. A “genome” refers tothe complete genetic information of an organism or virus, expressed innucleic acid sequences.

The term “clinically-relevant sequence” herein refers to a nucleic acidsequence that is known or is suspected to be associated or implicatedwith a genetic or disease condition. Determining the absence or presenceof a clinically-relevant sequence can be useful in determining adiagnosis or confirming a diagnosis of a medical condition, or providinga prognosis for the development of a disease.

The term “derived” when used in the context of a nucleic acid or amixture of nucleic acids, herein refers to the means whereby the nucleicacid(s) are obtained from the source from which they originate. Forexample, in one embodiment, a mixture of nucleic acids that is derivedfrom two different genomes means that the nucleic acids e.g. cfDNA, werenaturally released by cells through naturally occurring processes suchas necrosis or apoptosis. In another embodiment, a mixture of nucleicacids that is derived from two different genomes means that the nucleicacids were extracted from two different types of cells from a subject.

The term “mixed sample” herein refers to a sample containing a mixtureof nucleic acids, which are derived from different genomes.

The term “maternal sample” herein refers to a biological sample obtainedfrom a pregnant subject e.g. a woman.

The term “biological fluid” herein refers to a liquid taken from abiological source and includes, for example, blood, serum, plasma,sputum, lavage fluid, cerebrospinal fluid, urine, semen, sweat, tears,saliva, and the like. As used herein, the terms “blood,” “plasma” and“serum” expressly encompass fractions or processed portions thereof.Similarly, where a sample is taken from a biopsy, swab, smear, etc., the“sample” expressly encompasses a processed fraction or portion derivedfrom the biopsy, swab, smear, etc.

The terms “maternal nucleic acids” and “fetal nucleic acids” hereinrefer to the nucleic acids of a pregnant female subject and the nucleicacids of the fetus being carried by the pregnant female, respectively.

As used herein, the term “corresponding to” refers to a nucleic acidsequence e.g. a gene or a chromosome, that is present in the genome ofdifferent subjects, and which does not necessarily have the samesequence in all genomes, but serves to provide the identity rather thanthe genetic information of a sequence of interest e.g. a gene orchromosome.

As used herein, the term “substantially cell free” encompasses preparations of the desired sample from which components that are normally associated with it are removed. For example, a plasma sample is rendered essentially cell free by removing blood cells e.g. red cells, which are normally associated with it. In some embodiments, substantially cell free samples are processed to remove cells that would otherwise contribute to the desired genetic material that is to be tested for a CNV.

As used herein, the term “fetal fraction” refers to the fraction offetal nucleic acids present in a sample comprising fetal and maternalnucleic acid.

As used herein the term “chromosome” refers to the heredity-bearing genecarrier of a living cell which is derived from chromatin and whichcomprises DNA and protein components (especially histones). Theconventional internationally recognized individual human genomechromosome numbering system is employed herein.

As used herein, the term “polynucleotide length” refers to the absolutenumber of nucleic acid molecules (nucleotides) in a sequence or in aregion of a reference genome. The term “chromosome length” refers to theknown length of the chromosome given in base pairs e.g. provided in theNCBI36/hg18 assembly of the human chromosome found on the world wide webat genome.ucsc.edu/cgi-bin/hgTracks?hgsid=167155613&chromInfoPage=

The term “subject” herein refers to a human subject as well as anon-human subject such as a mammal, an invertebrate, a vertebrate, afungus, a yeast, a bacteria, and a virus. Although the examples hereinconcern humans and the language is primarily directed to human concerns,the concept of this invention is applicable to genomes from any plant oranimal, and is useful in the fields of veterinary medicine, animalsciences, research laboratories and such.

The term “condition” herein refers to “medical condition” as a broad term that includes all diseases and disorders, and can also include injuries and normal health situations, such as pregnancy, that might affect a person's health, benefit from medical assistance, or have implications for medical treatments.

6.2 DESCRIPTION

The invention provides a method for determining copy number variations(CNV) of a sequence of interest in a test sample that comprises amixture of nucleic acids derived from two different genomes, and whichare known or are suspected to differ in the amount of one or moresequence of interest. Copy number variations determined by the method ofthe invention include gains or losses of entire chromosomes, alterationsinvolving very large chromosomal segments that are microscopicallyvisible, and an abundance of sub-microscopic copy number variation ofDNA segments ranging from kilobases (kb) to megabases (Mb) in size.

CNVs in the human genome significantly influence human diversity and predisposition to disease (Redon et al., Nature 23:444-454 [2006]; Shaikh et al. Genome Res 19:1682-1690 [2009]). CNVs have been known to contribute to genetic disease through different mechanisms, resulting in either imbalance of gene dosage or gene disruption in most cases. In addition to their direct correlation with genetic disorders, CNVs are known to mediate phenotypic changes that can be deleterious. Recently, several studies have reported an increased burden of rare or de novo CNVs in complex disorders such as autism, ADHD, and schizophrenia as compared to normal controls, highlighting the potential pathogenicity of rare or unique CNVs (Sebat et al., 316:445-449 [2007]; Walsh et al., Science 320:539-543 [2008]). CNVs arise from genomic rearrangements, primarily owing to deletion, duplication, insertion, and unbalanced translocation events.

In one embodiment, the method described herein employs next generation sequencing technology (NGS) in which clonally amplified DNA templates or single DNA molecules are sequenced in a massively parallel fashion within a flow cell (e.g. as described in Volkerding et al., Clin Chem 55:641-658 [2009]; Metzker M, Nature Rev 11:31-46 [2010]). In addition to high-throughput sequence information, NGS provides digital quantitative information, in that each sequence read is a countable “sequence tag” representing an individual clonal DNA template or a single DNA molecule. This quantification allows NGS to expand the digital PCR concept of counting cell-free DNA molecules (Fan et al., Proc Natl Acad Sci USA 105:16266-16271 [2008]; Chiu et al., Proc Natl Acad Sci USA 105:20458-20463 [2008]). The sequencing technologies of NGS include pyrosequencing, sequencing-by-synthesis with reversible dye terminators, sequencing by oligonucleotide probe ligation, and real time sequencing.

Some of the sequencing technologies are available commercially, such as the sequencing-by-hybridization platform from Affymetrix Inc. (Sunnyvale, Calif.) and the sequencing-by-synthesis platforms from 454 Life Sciences (Bradford, Conn.), Illumina/Solexa (Hayward, Calif.) and Helicos Biosciences (Cambridge, Mass.), and the sequencing-by-ligation platform from Applied Biosystems (Foster City, Calif.), as described below. In addition to the single molecule sequencing performed using sequencing-by-synthesis of Helicos Biosciences, other single molecule sequencing technologies are encompassed by the method of the invention and include the SMRT™ technology of Pacific Biosciences, the Ion Torrent technology, and nanopore sequencing being developed, for example, by Oxford Nanopore Technologies.

While the automated Sanger method is considered a ‘first generation’ technology, Sanger sequencing, including automated Sanger sequencing, can also be employed by the method of the invention. Additional sequencing methods that comprise the use of developing nucleic acid imaging technologies, e.g. atomic force microscopy (AFM) or transmission electron microscopy (TEM), are also encompassed by the method of the invention. Exemplary sequencing technologies are described below.

In one embodiment, the DNA sequencing technology that is used in the method of the invention is the Helicos True Single Molecule Sequencing (tSMS) (e.g. as described in Harris T. D. et al., Science 320:106-109 [2008]). In the tSMS technique, a DNA sample is cleaved into strands of approximately 100 to 200 nucleotides, and a polyA sequence is added to the 3′ end of each DNA strand. Each strand is labeled by the addition of a fluorescently labeled adenosine nucleotide. The DNA strands are then hybridized to a flow cell, which contains millions of oligo-T capture sites that are immobilized to the flow cell surface. The templates can be at a density of about 100 million templates/cm². The flow cell is then loaded into an instrument, e.g., HeliScope™ sequencer, and a laser illuminates the surface of the flow cell, revealing the position of each template. A CCD camera can map the position of the templates on the flow cell surface. The template fluorescent label is then cleaved and washed away. The sequencing reaction begins by introducing a DNA polymerase and a fluorescently labeled nucleotide. The oligo-T nucleic acid serves as a primer. The polymerase incorporates the labeled nucleotides to the primer in a template directed manner. The polymerase and unincorporated nucleotides are removed. The templates that have directed incorporation of the fluorescently labeled nucleotide are identified by imaging the flow cell surface. After imaging, a cleavage step removes the fluorescent label, and the process is repeated with other fluorescently labeled nucleotides until the desired read length is achieved. Sequence information is collected with each nucleotide addition step.

In one embodiment, the DNA sequencing technology that is used in the method of the invention is the 454 sequencing (Roche) (e.g. as described in Margulies, M. et al., Nature 437:376-380 [2005]). 454 sequencing involves two steps. In the first step, DNA is sheared into fragments of approximately 300-800 base pairs, and the fragments are blunt-ended. Oligonucleotide adaptors are then ligated to the ends of the fragments. The adaptors serve as primers for amplification and sequencing of the fragments. The fragments can be attached to DNA capture beads, e.g., streptavidin-coated beads using, e.g., Adaptor B, which contains a 5′-biotin tag. The fragments attached to the beads are PCR amplified within droplets of an oil-water emulsion. The result is multiple copies of clonally amplified DNA fragments on each bead. In the second step, the beads are captured in wells (pico-liter sized). Pyrosequencing is performed on each DNA fragment in parallel. Addition of one or more nucleotides generates a light signal that is recorded by a CCD camera in a sequencing instrument. The signal strength is proportional to the number of nucleotides incorporated. Pyrosequencing makes use of pyrophosphate (PPi) which is released upon nucleotide addition. PPi is converted to ATP by ATP sulfurylase in the presence of adenosine 5′ phosphosulfate. Luciferase uses ATP to convert luciferin to oxyluciferin, and this reaction generates light that is measured and analyzed.

In one embodiment, the DNA sequencing technology that is used in the method of the invention is the SOLiD™ technology (Applied Biosystems). In SOLiD™ sequencing-by-ligation, genomic DNA is sheared into fragments, and adaptors are attached to the 5′ and 3′ ends of the fragments to generate a fragment library. Alternatively, internal adaptors can be introduced by ligating adaptors to the 5′ and 3′ ends of the fragments, circularizing the fragments, digesting the circularized fragment to generate an internal adaptor, and attaching adaptors to the 5′ and 3′ ends of the resulting fragments to generate a mate-paired library. Next, clonal bead populations are prepared in microreactors containing beads, primers, template, and PCR components. Following PCR, the templates are denatured and beads are enriched to separate the beads with extended templates. Templates on the selected beads are subjected to a 3′ modification that permits bonding to a glass slide. The sequence can be determined by sequential hybridization and ligation of partially random oligonucleotides with a central determined base (or pair of bases) that is identified by a specific fluorophore. After a color is recorded, the ligated oligonucleotide is cleaved and removed and the process is then repeated.

In one embodiment, the DNA sequencing technology that is used in the method of the invention is the single molecule, real-time (SMRT™) sequencing technology of Pacific Biosciences. In SMRT sequencing, the continuous incorporation of dye-labeled nucleotides is imaged during DNA synthesis. Single DNA polymerase molecules are attached to the bottom surface of individual zero-mode waveguide detectors (ZMW detectors) that obtain sequence information while phospholinked nucleotides are being incorporated into the growing primer strand. A ZMW is a confinement structure which enables observation of incorporation of a single nucleotide by DNA polymerase against the background of fluorescent nucleotides that rapidly diffuse in and out of the ZMW (in microseconds). It takes several milliseconds to incorporate a nucleotide into a growing strand. During this time, the fluorescent label is excited and produces a fluorescent signal, and the fluorescent tag is cleaved off. Measurement of the corresponding fluorescence of the dye indicates which base was incorporated. The process is repeated.

In one embodiment, the DNA sequencing technology that is used in the method of the invention is nanopore sequencing (e.g. as described in Soni G V and Meller A., Clin Chem 53:1996-2001 [2007]). Nanopore sequencing DNA analysis techniques are being industrially developed by a number of companies, including Oxford Nanopore Technologies (Oxford, United Kingdom). Nanopore sequencing is a single-molecule sequencing technology whereby a single molecule of DNA is sequenced directly as it passes through a nanopore. A nanopore is a small hole, of the order of 1 nanometer in diameter. Immersion of a nanopore in a conducting fluid and application of a potential (voltage) across it results in a slight electrical current due to conduction of ions through the nanopore. The amount of current which flows is sensitive to the size and shape of the nanopore. As a DNA molecule passes through a nanopore, each nucleotide on the DNA molecule obstructs the nanopore to a different degree, changing the magnitude of the current through the nanopore to different degrees. Thus, this change in the current as the DNA molecule passes through the nanopore represents a reading of the DNA sequence.

In one embodiment, the DNA sequencing technology that is used in the method of the invention is the chemical-sensitive field effect transistor (chemFET) array (e.g., as described in U.S. Patent Application Publication No. 20090026082). In one example of the technique, DNA molecules can be placed into reaction chambers, and the template molecules can be hybridized to a sequencing primer bound to a polymerase. Incorporation of one or more triphosphates into a new nucleic acid strand at the 3′ end of the sequencing primer can be discerned by a change in current by a chemFET. An array can have multiple chemFET sensors. In another example, single nucleic acids can be attached to beads, and the nucleic acids can be amplified on the bead, and the individual beads can be transferred to individual reaction chambers on a chemFET array, with each chamber having a chemFET sensor, and the nucleic acids can be sequenced.

In one embodiment, the DNA sequencing technology that is used in the method of the invention is Halcyon Molecular's method that uses transmission electron microscopy (TEM). The method, termed Individual Molecule Placement Rapid Nano Transfer (IMPRNT), comprises utilizing single atom resolution transmission electron microscope imaging of high-molecular weight (150 kb or greater) DNA selectively labeled with heavy atom markers and arranging these molecules on ultra-thin films in ultra-dense (3 nm strand-to-strand) parallel arrays with consistent base-to-base spacing. The electron microscope is used to image the molecules on the films to determine the position of the heavy atom markers and to extract base sequence information from the DNA. The method is further described in PCT patent publication WO 2009/046445. The method allows for sequencing complete human genomes in less than ten minutes.

In one embodiment, the DNA sequencing technology is the Ion Torrent single molecule sequencing, which pairs semiconductor technology with a simple sequencing chemistry to directly translate chemically encoded information (A, C, G, T) into digital information (0, 1) on a semiconductor chip. In nature, when a nucleotide is incorporated into a strand of DNA by a polymerase, a hydrogen ion is released as a byproduct. Ion Torrent uses a high-density array of micro-machined wells to perform this biochemical process in a massively parallel way. Each well holds a different DNA molecule. Beneath the wells is an ion-sensitive layer and beneath that an ion sensor. When a nucleotide, for example a C, is added to a DNA template and is then incorporated into a strand of DNA, a hydrogen ion will be released. The charge from that ion will change the pH of the solution, which can be detected by Ion Torrent's ion sensor. The sequencer, essentially the world's smallest solid-state pH meter, calls the base, going directly from chemical information to digital information. The Ion Personal Genome Machine (PGM™) sequencer then sequentially floods the chip with one nucleotide after another. If the next nucleotide that floods the chip is not a match, no voltage change will be recorded and no base will be called. If there are two identical bases on the DNA strand, the voltage will be double, and the chip will record two identical bases called. Direct detection allows recordation of nucleotide incorporation in seconds.
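The flow-by-flow base calling described above can be illustrated with a minimal sketch. It assumes that each flow yields one normalized signal (about 0 for no incorporation, about 1 for one base, about 2 for two identical bases); the function name `decode_flows`, the flow order, and the signal values are hypothetical and are not taken from the Ion Torrent software.

```python
def decode_flows(flow_order, signals):
    """Turn per-flow signal intensities into a base sequence.

    flow_order: nucleotides flooded over the chip, one character per flow.
    signals: one normalized signal per flow (assumed scale: ~0 means no
             incorporation, ~1 one base, ~2 two identical bases in a row).
    """
    if len(flow_order) != len(signals):
        raise ValueError("one signal is required per flow")
    bases = []
    for nucleotide, signal in zip(flow_order, signals):
        # Round to the nearest whole number of incorporated bases.
        count = max(0, round(signal))
        bases.append(nucleotide * count)
    return "".join(bases)


# Hypothetical flows and signals for a single well.
flows = "TACGTACG"
signals = [1.05, 0.02, 1.96, 0.10, 0.03, 0.97, 0.00, 1.10]
print(decode_flows(flows, signals))
```

A flow with a signal near zero contributes no base, and a doubled signal contributes two identical bases, mirroring the behavior described for the chip.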

Other sequencing methods include digital PCR and sequencing by hybridization. Digital polymerase chain reaction (digital PCR or dPCR) can be used to directly identify and quantify nucleic acids in a sample. Digital PCR can be performed in an emulsion. Individual nucleic acids are separated, e.g., in a microfluidic chamber device, and each nucleic acid is individually amplified by PCR. Nucleic acids can be separated such that there is an average of approximately 0.5 nucleic acids/well, or not more than one nucleic acid/well. Different probes can be used to distinguish fetal alleles and maternal alleles. Alleles can be enumerated to determine copy number. In sequencing by hybridization, the hybridization comprises contacting the plurality of polynucleotide sequences with a plurality of polynucleotide probes, wherein each of the plurality of polynucleotide probes can be optionally tethered to a substrate. The substrate might be a flat surface comprising an array of known nucleotide sequences. The pattern of hybridization to the array can be used to determine the polynucleotide sequences present in the sample. In other embodiments, each probe is tethered to a bead, e.g., a magnetic bead or the like. Hybridization to the beads can be determined and used to identify the plurality of polynucleotide sequences within the sample.
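To see what an average of approximately 0.5 nucleic acids/well implies, the following sketch assumes that molecules partition into wells independently, so that the number of molecules per well follows a Poisson distribution. That statistical model and the function names are assumptions introduced for illustration; they are not stated in the text above.

```python
import math


def well_occupancy(mean_per_well):
    """Return (P(empty), P(exactly one), P(more than one)) under a Poisson model."""
    p_empty = math.exp(-mean_per_well)
    p_single = mean_per_well * math.exp(-mean_per_well)
    p_multiple = 1.0 - p_empty - p_single
    return p_empty, p_single, p_multiple


def mean_from_positive_fraction(positive_fraction):
    """Estimate the mean molecules per well from the fraction of amplifying wells."""
    if not 0.0 <= positive_fraction < 1.0:
        raise ValueError("positive fraction must be in [0, 1)")
    return -math.log(1.0 - positive_fraction)


p_empty, p_single, p_multiple = well_occupancy(0.5)
print(f"empty={p_empty:.3f} single={p_single:.3f} multiple={p_multiple:.3f}")
```

Under this assumed model, most occupied wells at a loading of 0.5 hold a single molecule, which is why counting positive wells approximates counting molecules.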

In one embodiment, the method employs massively parallel sequencing of millions of DNA fragments using Illumina's sequencing-by-synthesis and reversible terminator-based sequencing chemistry (e.g. as described in Bentley et al., Nature 6:53-59 [2009]). Template DNA can be genomic DNA e.g. cfDNA. In some embodiments, genomic DNA from isolated cells is used as the template, and it is fragmented into lengths of several hundred base pairs. In other embodiments, cfDNA is used as the template, and fragmentation is not required as cfDNA exists as short fragments. For example, fetal cfDNA circulates in the bloodstream as fragments of <300 bp, and maternal cfDNA has been estimated to circulate as fragments of between about 0.5 and 1 Kb (Li et al., Clin Chem 50:1002-1011 [2004]). Illumina's sequencing technology relies on the attachment of fragmented genomic DNA to a planar, optically transparent surface on which oligonucleotide anchors are bound. Template DNA is end-repaired to generate 5′-phosphorylated blunt ends, and the polymerase activity of Klenow fragment is used to add a single A base to the 3′ end of the blunt phosphorylated DNA fragments. This addition prepares the DNA fragments for ligation to oligonucleotide adapters, which have an overhang of a single T base at their 3′ end to increase ligation efficiency. The adapter oligonucleotides are complementary to the flow-cell anchors. Under limiting-dilution conditions, adapter-modified, single-stranded template DNA is added to the flow cell and immobilized by hybridization to the anchors. Attached DNA fragments are extended and bridge amplified to create an ultra-high density sequencing flow cell with hundreds of millions of clusters, each containing ~1,000 copies of the same template. In one embodiment, the randomly fragmented genomic DNA e.g. cfDNA, is amplified before it is subjected to cluster amplification. Alternatively, an amplification-free genomic library preparation is used, and the randomly fragmented genomic DNA e.g. cfDNA is enriched using the cluster amplification alone (Kozarewa et al., Nature Methods 6:291-295 [2009]). The templates are sequenced using a robust four-color DNA sequencing-by-synthesis technology that employs reversible terminators with removable fluorescent dyes. High-sensitivity fluorescence detection is achieved using laser excitation and total internal reflection optics. Short sequence reads of about 20-40 bp e.g. 36 bp, are aligned against a repeat-masked reference genome and genetic differences are called using specially developed data analysis pipeline software. After completion of the first read, the templates can be regenerated in situ to enable a second read from the opposite end of the fragments. Thus, either single-end or paired end sequencing of the DNA fragments is used according to the method. Partial sequencing of DNA fragments present in the sample is performed, and sequence tags comprising reads of predetermined length e.g. 36 bp, that are mapped to a known reference genome are counted. In one embodiment, the reference genome sequence is the NCBI36/hg18 sequence, which is available on the world wide web at genome.ucsc.edu/cgi-bin/hgGateway?org=Human&db=hg18&hgsid=166260105. Other sources of public sequence information include GenBank, dbEST, dbSTS, EMBL (the European Molecular Biology Laboratory), and the DDBJ (the DNA Databank of Japan). A number of computer algorithms are available for aligning sequences, including without limitation BLAST (Altschul et al., 1990), BLITZ (MPsrch) (Sturrock & Collins, 1993), FASTA (Pearson & Lipman, 1988), BOWTIE (Langmead et al., Genome Biology 10:R25.1-R25.10 [2009]), or ELAND (Illumina, Inc., San Diego, Calif., USA). In one embodiment, one end of the clonally expanded copies of the plasma cfDNA molecules is sequenced and processed by bioinformatic alignment analysis for the Illumina Genome Analyzer, which uses the Efficient Large-Scale Alignment of Nucleotide Databases (ELAND) software.

In some embodiments of the method described herein, the mapped sequence tags comprise sequence reads of about 20 bp, about 25 bp, about 30 bp, about 35 bp, about 40 bp, about 45 bp, about 50 bp, about 55 bp, about 60 bp, about 65 bp, about 70 bp, about 75 bp, about 80 bp, about 85 bp, about 90 bp, about 95 bp, about 100 bp, about 110 bp, about 120 bp, about 130 bp, about 140 bp, about 150 bp, about 200 bp, about 250 bp, about 300 bp, about 350 bp, about 400 bp, about 450 bp, or about 500 bp. It is expected that technological advances will enable single-end reads of greater than 500 bp, allowing reads of greater than about 1000 bp when paired end reads are generated. In one embodiment, the mapped sequence tags comprise sequence reads that are 36 bp. Mapping of the sequence tags is achieved by comparing the sequence of the tag with the sequence of the reference to determine the chromosomal origin of the sequenced nucleic acid (e.g. cfDNA) molecule, and specific genetic sequence information is not needed. A small degree of mismatch (0-2 mismatches per sequence tag) may be allowed to account for minor polymorphisms that may exist between the reference genome and the genomes in the mixed sample.
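The idea of assigning a tag to a chromosomal origin while tolerating 0-2 mismatches can be sketched with a brute-force scan. This toy example is not how BLAST, BOWTIE, or ELAND work internally; the function names, the tiny reference sequences, and the rule of discarding tags with more than one hit are illustrative assumptions.

```python
def hamming(a, b):
    """Count mismatched positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def map_tag(tag, reference, max_mismatches=2):
    """Return (chromosome, position) if the tag matches exactly one location.

    reference: dict mapping a chromosome name to its sequence string.
    A tag that matches nowhere, or at more than one location, returns None.
    """
    hits = []
    for chrom, seq in reference.items():
        for pos in range(len(seq) - len(tag) + 1):
            if hamming(tag, seq[pos:pos + len(tag)]) <= max_mismatches:
                hits.append((chrom, pos))
    return hits[0] if len(hits) == 1 else None


# Hypothetical miniature reference; real references are whole genomes.
reference = {"chrA": "ACGTACGTTTGACCA", "chrB": "GGGGGCCCCCAAAAA"}
print(map_tag("TTGACG", reference))  # one mismatch against chrA
print(map_tag("GGGG", reference))    # ambiguous, so not counted
```

Only the chromosome of origin is kept for counting; the identity of the mismatched base plays no further role.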

A plurality of sequence tags are obtained per sample. In some embodiments, at least about 3×10⁶ sequence tags, at least about 5×10⁶ sequence tags, at least about 8×10⁶ sequence tags, at least about 10×10⁶ sequence tags, at least about 15×10⁶ sequence tags, at least about 20×10⁶ sequence tags, at least about 30×10⁶ sequence tags, at least about 40×10⁶ sequence tags, or at least about 50×10⁶ sequence tags comprising between 20 and 40 bp reads e.g. 36 bp, are obtained from mapping the reads to the reference genome per sample. In one embodiment, all the sequence reads are mapped to all regions of the reference genome. In one embodiment, the tags that have been mapped to all regions e.g. all chromosomes, of the reference genome are counted, and the CNV i.e. the over- or under-representation of a sequence of interest e.g. a chromosome or portion thereof, in the mixed DNA sample is determined. The method does not require differentiation between the two genomes.
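Counting mapped tags per chromosome can be sketched as follows. The input format (one `(chromosome, position)` pair per uniquely mapped tag) and the function name are assumptions made for illustration.

```python
from collections import Counter


def count_tags_per_chromosome(mapped_tags):
    """Count uniquely mapped sequence tags for each chromosome.

    mapped_tags: iterable of (chromosome, position) pairs, one per tag.
    """
    return Counter(chrom for chrom, _position in mapped_tags)


# Hypothetical mapped tags; a real sample yields millions of them.
tags = [("chr21", 1050), ("chr14", 88), ("chr21", 977), ("chr1", 4), ("chr14", 312)]
counts = count_tags_per_chromosome(tags)
print(counts["chr21"], counts["chr14"], counts["chr1"])
```

The counts make no distinction between the two genomes in the mixture, consistent with the statement that differentiation between them is not required.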

The accuracy required for correctly determining whether a CNV e.g. aneuploidy, is present or absent in a sample, is predicated on the variation of the number of sequence tags that map to the reference genome among samples within a sequencing run (inter-chromosomal variability), and the variation of the number of sequence tags that map to the reference genome in different sequencing runs (inter-sequencing variability). For example, the variations can be particularly pronounced for tags that map to GC-rich or GC-poor reference sequences. The present method uses chromosome doses based on the knowledge of normalizing chromosomes, to intrinsically account for the accrued variability stemming from interchromosomal, inter-sequencing and platform-dependent variability.

FIG. 1 provides a flow diagram of an embodiment of the method of the invention 100 for determining a CNV of a sequence of interest in a biological sample. In some embodiments, a biological sample is obtained from a subject and comprises a mixture of nucleic acids contributed by different genomes. The different genomes can be contributed to the sample by two individuals e.g. the different genomes are contributed by the fetus and the mother carrying the fetus. Alternatively, the genomes are contributed to the sample by aneuploid cancerous cells and normal euploid cells from the same subject e.g. a plasma sample from a cancer patient.

A set of qualified samples is obtained to identify qualified normalizing sequences and to provide variance values for use in determining statistically meaningful identification of CNV in test samples. In step 110, a plurality of biological qualified samples are obtained from a plurality of subjects known to comprise cells having a normal copy number for any one sequence of interest. In one embodiment, the qualified samples are obtained from mothers pregnant with a fetus that has been confirmed using cytogenetic means to have a normal copy number of chromosomes. The biological qualified samples may be a biological fluid e.g. plasma, or any suitable sample as described below. In some embodiments, a qualified sample contains a mixture of nucleic acid molecules e.g. cfDNA molecules. In some embodiments, the qualified sample is a maternal plasma sample that contains a mixture of fetal and maternal cfDNA molecules.

In step 120, at least a portion of each of all the qualified nucleic acids contained in the qualified samples are sequenced to generate sequence reads e.g. 36 bp reads, which are aligned to a reference genome, e.g. hg18. In some embodiments, the sequence reads comprise about 20 bp, about 25 bp, about 30 bp, about 35 bp, about 40 bp, about 45 bp, about 50 bp, about 55 bp, about 60 bp, about 65 bp, about 70 bp, about 75 bp, about 80 bp, about 85 bp, about 90 bp, about 95 bp, about 100 bp, about 110 bp, about 120 bp, about 130 bp, about 140 bp, about 150 bp, about 200 bp, about 250 bp, about 300 bp, about 350 bp, about 400 bp, about 450 bp, or about 500 bp. It is expected that technological advances will enable single-end reads of greater than 500 bp, allowing reads of greater than about 1000 bp when paired end reads are generated. In one embodiment, the mapped sequence reads comprise 36 bp. Sequence reads are aligned to a reference genome, and the reads that are uniquely mapped to the reference genome are known as sequence tags. In one embodiment, at least about 3×10⁶ qualified sequence tags, at least about 5×10⁶ qualified sequence tags, at least about 8×10⁶ qualified sequence tags, at least about 10×10⁶ qualified sequence tags, at least about 15×10⁶ qualified sequence tags, at least about 20×10⁶ qualified sequence tags, at least about 30×10⁶ qualified sequence tags, at least about 40×10⁶ qualified sequence tags, or at least about 50×10⁶ qualified sequence tags comprising between 20 and 40 bp reads are obtained from reads that map uniquely to a reference genome.

In step 130, all the tags obtained from sequencing the nucleic acids in the qualified samples are counted to determine a qualified sequence tag density. In one embodiment the sequence tag density is determined as the number of qualified sequence tags mapped to the sequence of interest on the reference genome. In another embodiment, the qualified sequence tag density is determined as the number of qualified sequence tags mapped to a sequence of interest normalized to the length of the qualified sequence of interest to which they are mapped. Sequence tag densities that are determined as a ratio of the tag density relative to the length of the sequence of interest are herein referred to as tag density ratios. Normalization to the length of the sequence of interest is not required, and may be included as a step to reduce the number of digits in a number to simplify it for human interpretation. As all qualified sequence tags are mapped and counted in each of the qualified samples, the sequence tag density for a sequence of interest e.g. a clinically-relevant sequence, in the qualified samples is determined, as are the sequence tag densities for additional sequences from which normalizing sequences are identified subsequently. In one embodiment, the sequence of interest is a chromosome that is associated with a chromosomal aneuploidy e.g. chromosome 21, and the qualified normalizing sequence is a chromosome that is not associated with a chromosomal aneuploidy and whose variation in sequence tag density best approximates that of chromosome 21. For example, a qualified normalizing sequence is a sequence that has the smallest variability. In some embodiments, the normalizing sequence is a sequence that best distinguishes one or more qualified samples from one or more affected samples i.e. the normalizing sequence is a sequence that has the greatest differentiability. In other embodiments, the normalizing sequence is a sequence that has the smallest variability and the greatest differentiability. The level of differentiability can be determined as a statistical difference between the chromosome doses in a population of qualified samples and the chromosome dose(s) in one or more test samples.
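The two ways of expressing a sequence tag density described in step 130 can be sketched as below. The function name, the tag count, and the rounded sequence length are illustrative assumptions and are not values from the specification.

```python
def tag_density(tag_count, sequence_length_bp=None):
    """Return the raw tag count, or tags per base pair when a length is given.

    With no length, the density is simply the number of mapped tags.
    With a length, the result corresponds to a tag density ratio.
    """
    if sequence_length_bp is None:
        return tag_count
    if sequence_length_bp <= 0:
        raise ValueError("sequence length must be positive")
    return tag_count / sequence_length_bp


# Illustrative count and rounded length, not measured values.
chr21_tags, chr21_length_bp = 141_000, 47_000_000
print(tag_density(chr21_tags))
print(tag_density(chr21_tags, chr21_length_bp))
```

Because the length of a given sequence is the same in every sample, dividing by it rescales all samples by one constant, which is consistent with the statement that the normalization is optional.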

In another embodiment, the sequence of interest is a segment of a chromosome associated with a partial aneuploidy, e.g. a chromosomal deletion or insertion, or unbalanced chromosomal translocation, and the normalizing sequence is a chromosomal segment that is not associated with the partial aneuploidy and whose variation in sequence tag density best approximates that of the chromosome segment associated with the partial aneuploidy.

In step 140, based on the calculated qualified tag densities, a qualified sequence dose for a sequence of interest is determined as the ratio of the sequence tag density for the sequence of interest and the qualified sequence tag density for additional sequences from which normalizing sequences are identified subsequently. In one embodiment, doses for the chromosome of interest e.g. chromosome 21, are determined as a ratio of the sequence tag density of chromosome 21 and the sequence tag density for each of all the remaining chromosomes i.e. chromosomes 1-20, chromosome 22, chromosome X, and chromosome Y.
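The dose calculation of step 140 amounts to one ratio per candidate sequence. The sketch below assumes tag densities are held in a dictionary keyed by chromosome name; the function name and the numbers are hypothetical.

```python
def chromosome_doses(tag_densities, chromosome_of_interest):
    """Return the dose of the chromosome of interest against every other chromosome.

    tag_densities: dict mapping chromosome name to its sequence tag density.
    Chromosomes with a zero density are skipped to avoid division by zero.
    """
    target = tag_densities[chromosome_of_interest]
    return {
        chrom: target / density
        for chrom, density in tag_densities.items()
        if chrom != chromosome_of_interest and density > 0
    }


# Hypothetical tag counts for one qualified sample.
sample = {"chr21": 100, "chr14": 200, "chr1": 400}
print(chromosome_doses(sample, "chr21"))
```

Each entry in the result is one candidate dose; which denominator serves best as the normalizing sequence is decided afterward, across the set of qualified samples.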

In step 145, a normalizing sequence is identified for a sequence of interest in a qualified sample based on the calculated sequence doses. The method identifies sequences that inherently have similar characteristics and that are prone to similar variations among samples and sequencing runs, and which are useful for determining sequence doses in test samples. In some embodiments, more than one normalizing sequence is identified. For example, the variation e.g. coefficient of variation, in chromosome dose for chromosome of interest 21 is least when the sequence tag density of chromosome 14 is used. In other embodiments, two, three, four, five, six, seven, eight or more normalizing sequences are identified for use in determining a sequence dose for a sequence of interest in a test sample. In one embodiment, the normalizing sequence for chromosome 21 is selected from chromosome 9, chromosome 1, chromosome 2, chromosome 3, chromosome 4, chromosome 5, chromosome 6, chromosome 7, chromosome 8, chromosome 10, chromosome 11, chromosome 12, chromosome 13, chromosome 14, chromosome 15, chromosome 16, and chromosome 17. Preferably, the normalizing sequence for chromosome 21 is selected from chromosome 9, chromosome 1, chromosome 2, chromosome 11, chromosome 12, and chromosome 14. Alternatively, the normalizing sequence for chromosome 21 is a group of chromosomes selected from chromosome 9, chromosome 1, chromosome 2, chromosome 3, chromosome 4, chromosome 5, chromosome 6, chromosome 7, chromosome 8, chromosome 10, chromosome 11, chromosome 12, chromosome 13, chromosome 14, chromosome 15, chromosome 16, and chromosome 17. Preferably, the group of chromosomes is a group selected from chromosome 9, chromosome 1, chromosome 2, chromosome 11, chromosome 12, and chromosome 14.
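One way to read the selection in step 145 is to compute, for every candidate chromosome, the coefficient of variation of the dose across the qualified samples and keep the candidate with the smallest value. The sketch below follows that reading only; it does not cover the differentiability criterion or groups of chromosomes, and the function names and counts are hypothetical.

```python
import statistics


def coefficient_of_variation(values):
    """Sample standard deviation divided by the mean."""
    return statistics.stdev(values) / statistics.mean(values)


def pick_normalizing_chromosome(samples, chromosome_of_interest):
    """Return (best candidate, CV per candidate) across qualified samples.

    samples: list of dicts mapping chromosome name to a nonzero tag density,
             one dict per qualified sample (at least two samples).
    """
    shared = set.intersection(*(set(s) for s in samples))
    candidates = sorted(shared - {chromosome_of_interest})
    cv_by_candidate = {}
    for candidate in candidates:
        doses = [s[chromosome_of_interest] / s[candidate] for s in samples]
        cv_by_candidate[candidate] = coefficient_of_variation(doses)
    best = min(cv_by_candidate, key=cv_by_candidate.get)
    return best, cv_by_candidate


# Hypothetical tag counts from three qualified samples.
qualified = [
    {"chr21": 100, "chr14": 200, "chr1": 1000},
    {"chr21": 110, "chr14": 220, "chr1": 1000},
    {"chr21": 90, "chr14": 180, "chr1": 1100},
]
best, cvs = pick_normalizing_chromosome(qualified, "chr21")
print(best, cvs)
```

In this made-up data the chromosome 14 counts track the chromosome 21 counts exactly, so the dose is constant and its coefficient of variation is zero.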

In one embodiment, the normalizing sequence for chromosome 18 is selected from chromosome 8, chromosome 2, chromosome 3, chromosome 4, chromosome 5, chromosome 6, chromosome 7, chromosome 9, chromosome 10, chromosome 11, chromosome 12, chromosome 13, and chromosome 14. Preferably, the normalizing sequence for chromosome 18 is selected from chromosome 8, chromosome 2, chromosome 3, chromosome 5, chromosome 6, chromosome 12, and chromosome 14. Alternatively, the normalizing sequence for chromosome 18 is a group of chromosomes selected from chromosome 8, chromosome 2, chromosome 3, chromosome 4, chromosome 5, chromosome 6, chromosome 7, chromosome 9, chromosome 10, chromosome 11, chromosome 12, chromosome 13, and chromosome 14. Preferably, the group of chromosomes is a group selected from chromosome 8, chromosome 2, chromosome 3, chromosome 5, chromosome 6, chromosome 12, and chromosome 14.

In one embodiment, the normalizing sequence for chromosome X is selected from chromosome 1, chromosome 2, chromosome 3, chromosome 4, chromosome 5, chromosome 6, chromosome 7, chromosome 8, chromosome 9, chromosome 10, chromosome 11, chromosome 12, chromosome 13, chromosome 14, chromosome 15, and chromosome 16. Preferably, the normalizing sequence for chromosome X is selected from chromosome 2, chromosome 3, chromosome 4, chromosome 5, chromosome 6 and chromosome 8. Alternatively, the normalizing sequence for chromosome X is a group of chromosomes selected from chromosome 1, chromosome 2, chromosome 3, chromosome 4, chromosome 5, chromosome 6, chromosome 7, chromosome 8, chromosome 9, chromosome 10, chromosome 11, chromosome 12, chromosome 13, chromosome 14, chromosome 15, and chromosome 16. Preferably, the group of chromosomes is a group selected from chromosome 2, chromosome 3, chromosome 4, chromosome 5, chromosome 6, and chromosome 8.

In one embodiment, the normalizing sequence for chromosome 13 is a chromosome selected from chromosome 2, chromosome 3, chromosome 4, chromosome 5, chromosome 6, chromosome 7, chromosome 8, chromosome 9, chromosome 10, chromosome 11, chromosome 12, chromosome 14, chromosome 18, and chromosome 21. Preferably, the normalizing sequence for chromosome 13 is a chromosome selected from chromosome 2, chromosome 3, chromosome 4, chromosome 5, chromosome 6, and chromosome 8. In another embodiment, the normalizing sequence for chromosome 13 is a group of chromosomes selected from chromosome 2, chromosome 3, chromosome 4, chromosome 5, chromosome 6, chromosome 7, chromosome 8, chromosome 9, chromosome 10, chromosome 11, chromosome 12, chromosome 14, chromosome 18, and chromosome 21. Preferably, the group of chromosomes is a group selected from chromosome 2, chromosome 3, chromosome 4, chromosome 5, chromosome 6, and chromosome 8.

The variation in chromosome dose for chromosome Y is greater than 30 independently of which normalizing chromosome is used in determining the chromosome Y dose. Therefore, any one chromosome, or a group of two or more chromosomes selected from chromosomes 1-22 and chromosome X can be used as the normalizing sequence for chromosome Y. In one embodiment, the at least one normalizing chromosome is a group of chromosomes consisting of chromosomes 1-22, and chromosome X. In another embodiment, the group of chromosomes consists of chromosome 2, chromosome 3, chromosome 4, chromosome 5, and chromosome 6.

Based on the identification of the normalizing sequence(s) in qualified samples, a sequence dose is determined for a sequence of interest in a test sample comprising a mixture of nucleic acids derived from genomes that differ in one or more sequences of interest.

In step 115, a test sample is obtained from a subject suspected or known to carry a clinically-relevant CNV of a sequence of interest. The test sample may be a biological fluid e.g. plasma, or any suitable sample as described below. In some embodiments, a test sample contains a mixture of nucleic acid molecules e.g. cfDNA molecules. In some embodiments, the test sample is a maternal plasma sample that contains a mixture of fetal and maternal cfDNA molecules.

In step 125, at least a portion of the test nucleic acids in the test sample is sequenced to generate millions of sequence reads comprising between 20 and 500 bp, e.g. 36 bp. As in step 120, the reads generated from sequencing the nucleic acids in the test sample are uniquely mapped to a reference genome. As described in step 120, at least about 3×10⁶ qualified sequence tags, at least about 5×10⁶ qualified sequence tags, at least about 8×10⁶ qualified sequence tags, at least about 10×10⁶ qualified sequence tags, at least about 15×10⁶ qualified sequence tags, at least about 20×10⁶ qualified sequence tags, at least about 30×10⁶ qualified sequence tags, at least about 40×10⁶ qualified sequence tags, or at least about 50×10⁶ qualified sequence tags comprising between 20 and 40 bp reads are obtained from reads that map uniquely to a reference genome.

In step 135, all the tags obtained from sequencing the nucleic acids in the test samples are counted to determine a test sequence tag density. In one embodiment, the number of test sequence tags mapped to a sequence of interest is normalized to the known length of the sequence of interest to which they are mapped to provide a test sequence tag density ratio. As described for the qualified samples, normalization to the known length of a sequence of interest is not required, and may be included as a step to reduce the number of digits in a number to simplify it for human interpretation. As all the mapped test sequence tags are counted in the test sample, the sequence tag density for a sequence of interest e.g. a clinically-relevant sequence, in the test samples is determined, as are the sequence tag densities for additional sequences that correspond to at least one normalizing sequence identified in the qualified samples.
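The counting described in step 135 can be sketched as follows. This is a minimal illustration, not the implementation of the method: the function names, the toy tag list, and the toy lengths are all assumptions introduced here, and the lengths are arbitrary units rather than real chromosome sizes.

```python
from collections import Counter


def count_tags(mapped_chromosomes):
    """Count uniquely mapped tags per sequence (here, per chromosome)."""
    return Counter(mapped_chromosomes)


def tag_density(counts, lengths=None):
    """Return tag densities.

    If lengths are supplied, each count is divided by the length of the
    sequence it maps to; otherwise the raw counts are returned, since the
    text notes that length normalization is optional.
    """
    if lengths is None:
        return {name: float(n) for name, n in counts.items()}
    return {name: n / lengths[name] for name, n in counts.items()}


# Toy input: the chromosome each uniquely mapped tag was assigned to.
tags = ["chr21", "chr14", "chr14", "chr21", "chr14", "chr1"]
# Illustrative lengths only (arbitrary units), not real chromosome sizes.
toy_lengths = {"chr1": 250.0, "chr14": 100.0, "chr21": 50.0}

counts = count_tags(tags)
densities = tag_density(counts, toy_lengths)
```

Densities are computed for every sequence in one pass, so the values for the sequence of interest and for any normalizing sequence come from the same test sample.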

In step 150, based on the identity of at least one normalizing sequence in the qualified samples, a test sequence dose is determined for a sequence of interest in the test sample. The sequence dose for a sequence of interest in a test sample is a ratio of the sequence tag density determined for the sequence of interest in the test sample and the sequence tag density of at least one normalizing sequence determined in the test sample, wherein the normalizing sequence in the test sample corresponds to the normalizing sequence identified in the qualified samples for the particular sequence of interest. For example, if the normalizing sequence identified for chromosome 21 in the qualified samples is determined to be chromosome 14, then the test sequence dose for chromosome 21 (sequence of interest) is determined as the ratio of the sequence tag density for chromosome 21 and the sequence tag density for chromosome 14, each determined in the test sample. Similarly, chromosome doses for chromosomes 13, 18, X, Y, and other chromosomes associated with chromosomal aneuploidies are determined. As described previously, a sequence of interest can be part of a chromosome e.g. a chromosome segment. Accordingly, the dose for a chromosome segment can be determined as the ratio of the sequence tag density determined for the segment in the test sample and the sequence tag density for the normalizing chromosome segment in the test sample, wherein the normalizing segment in the test sample corresponds to the normalizing segment identified in the qualified samples for the particular segment of interest.
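The ratio described in step 150 can be illustrated with a short sketch. The function name and the example densities are hypothetical. Where the normalizing sequence is a group of chromosomes, this sketch sums their tag densities, which is an assumption made here; the text does not specify how a group is combined.

```python
def sequence_dose(densities, sequence_of_interest, normalizing_sequences):
    """Ratio of the tag density of the sequence of interest to the tag
    density of the normalizing sequence(s), all from the same test sample.

    Assumption: densities of a group of normalizing sequences are summed.
    """
    denominator = sum(densities[name] for name in normalizing_sequences)
    if denominator == 0:
        raise ValueError("normalizing sequence has no mapped tags")
    return densities[sequence_of_interest] / denominator


# Hypothetical tag densities for one test sample.
test_densities = {"chr21": 12000.0, "chr14": 30000.0}

# Chromosome 21 dose with chromosome 14 as the normalizing chromosome.
chr21_dose = sequence_dose(test_densities, "chr21", ["chr14"])
```

The same function applies to a chromosome segment if the dictionary keys name segments instead of whole chromosomes.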

In step 155, threshold values are derived from standard deviation values established for a plurality of qualified sequence doses. Accurate classification depends on the differences between probability distributions for the different classes i.e. type of aneuploidy. Preferably, thresholds are chosen from the empirical distribution for each type of aneuploidy e.g. trisomy 21. Possible threshold values established for classifying trisomy 13, trisomy 18, trisomy 21, and monosomy X aneuploidies are described in the Examples, which describe the use of the method for determining chromosomal aneuploidies by sequencing cfDNA extracted from a maternal sample comprising a mixture of fetal and maternal nucleic acids.
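One way thresholds might be derived from the standard deviation of qualified sequence doses is sketched below. The multiplier `k`, the function name, and the example doses are assumptions for illustration only; the text states that thresholds are preferably chosen from empirical distributions and does not prescribe a particular multiplier.

```python
from statistics import mean, stdev


def dose_thresholds(qualified_doses, k=3.0):
    """Return (lower, upper) thresholds as mean -/+ k sample standard deviations.

    k is a hypothetical, user-chosen multiplier, not a value from the text.
    """
    mu = mean(qualified_doses)
    sd = stdev(qualified_doses)  # sample standard deviation (n - 1)
    return mu - k * sd, mu + k * sd


# Hypothetical chromosome doses measured in qualified (normal) samples.
qualified = [0.38, 0.40, 0.42]
lower, upper = dose_thresholds(qualified, k=3.0)
```

A larger `k` widens the band and trades sensitivity for fewer false positives, which is one way a user-defined level of reliability could be expressed.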

In step 160, the copy number variation of the sequence of interest is determined in the test sample by comparing the test sequence dose for the sequence of interest to at least one threshold value established from the qualified sequence doses.

In step 165, the calculated dose for a test sequence of interest is compared to the threshold values, which are chosen according to a user-defined threshold of reliability, to classify the sample as a “normal”, an “affected”, or a “no call”. The “no call” samples are samples for which a definitive diagnosis cannot be made with reliability.
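The three-way classification of step 165 can be sketched as below. This sketch assumes the over-representation case (for example a trisomy), in which a higher dose indicates an affected sample, and it assumes two user-chosen cut-offs with a “no call” zone between them. The function name, parameter names, and numeric values are hypothetical; for an under-representation such as a monosomy the comparisons would be reversed.

```python
def classify_dose(dose, normal_max, affected_min):
    """Classify a test sequence dose as 'normal', 'affected', or 'no call'.

    Assumes an over-representation (e.g. trisomy) setting:
      dose <= normal_max    -> 'normal'
      dose >= affected_min  -> 'affected'
      otherwise             -> 'no call'
    Both cut-offs are hypothetical, user-defined values.
    """
    if normal_max > affected_min:
        raise ValueError("normal_max must not exceed affected_min")
    if dose <= normal_max:
        return "normal"
    if dose >= affected_min:
        return "affected"
    return "no call"


# Hypothetical cut-offs and test doses.
calls = [classify_dose(d, normal_max=0.44, affected_min=0.48)
         for d in (0.40, 0.46, 0.50)]
```

Setting the two cut-offs equal removes the “no call” zone, while moving them apart raises the reliability demanded of a definitive call.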

Another embodiment of the invention provides a method for providing prenatal diagnosis of a fetal chromosomal aneuploidy in a biological sample comprising fetal and maternal nucleic acid molecules. The diagnosis is made based on receiving the data from sequencing at least a portion of the mixture of the fetal and maternal nucleic acid molecules derived from a biological test sample e.g. a maternal plasma sample, computing from the sequencing data a normalizing chromosome dose for one or more chromosomes of interest, determining a statistically significant difference between the normalizing chromosome dose for the chromosome of interest in the test sample and a threshold value established in a plurality of qualified (normal) samples, and providing the prenatal diagnosis based on the statistical difference. As described in step 165 of the method, a diagnosis of normal or affected is made. A “no call” is provided in the event that the diagnosis for normal or affected cannot be made with confidence.

Samples

Samples that are used for determining a CNV e.g. chromosomal and partial aneuploidies, comprise nucleic acids that are present in cells or that are “cell-free”. In some embodiments of the invention it is advantageous to obtain cell-free nucleic acids e.g. cell-free DNA (cfDNA). Cell-free nucleic acids, including cell-free DNA, can be obtained by various methods known in the art from biological samples including but not limited to plasma and serum (Chen et al., Nature Med. 2:1033-1035 [1996]; Lo et al., Lancet 350:485-487 [1997]). To separate cell-free DNA from cells, fractionation, centrifugation (e.g., density gradient centrifugation), DNA-specific precipitation, or high-throughput cell sorting and/or separation methods can be used. Examples of methods for processing fluid samples have been previously disclosed, e.g., U.S. Patent Application Nos. 20050282293, 20050224351, and 20050065735.

The sample comprising the mixture of nucleic acids to which the methods described herein are applied is a biological sample such as a tissue sample, a biological fluid sample, or a cell sample. In some embodiments, the mixture of nucleic acids is purified or isolated from the biological sample by any one of the known methods. A sample can consist of purified or isolated polynucleotide, or it can comprise a biological sample such as a tissue sample, a biological fluid sample, or a cell sample. A biological fluid includes, as non-limiting examples, blood, plasma, serum, sweat, tears, sputum, urine, ear flow, lymph, saliva, cerebrospinal fluid, lavages, bone marrow suspension, vaginal flow, transcervical lavage, brain fluid, ascites, milk, secretions of the respiratory, intestinal and genitourinary tracts, amniotic fluid and leukapheresis samples. In some embodiments, the sample is a sample that is easily obtainable by non-invasive procedures e.g. blood, plasma, serum, sweat, tears, sputum, urine, ear flow, and saliva. Preferably, the biological sample is a peripheral blood sample, or the plasma and serum fractions. In other embodiments, the biological sample is a swab or smear, a biopsy specimen, or a cell culture. In another embodiment, the sample is a mixture of two or more biological samples e.g. a biological sample can comprise two or more of a biological fluid sample, a tissue sample, and a cell culture sample. As used herein, the terms “blood,” “plasma” and “serum” expressly encompass fractions or processed portions thereof. Similarly, where a sample is taken from a biopsy, swab, smear, etc., the “sample” expressly encompasses a processed fraction or portion derived from the biopsy, swab, smear, etc.

In some embodiments, samples can be obtained from sources including, but not limited to, samples from different individuals, different developmental stages of the same or different individuals, different diseased individuals (e.g., individuals with cancer or suspected of having a genetic disorder), normal individuals, samples obtained at different stages of a disease in an individual, samples obtained from an individual subjected to different treatments for a disease, samples from individuals subjected to different environmental factors, or individuals with predisposition to a pathology, or individuals with exposure to an infectious disease agent (e.g., HIV).

In one embodiment, the sample is a maternal sample that is obtained from a pregnant female, for example a pregnant woman. In this instance, the sample can be analyzed using the methods described herein to provide a prenatal diagnosis of potential chromosomal abnormalities in the fetus. The maternal sample can be a tissue sample, a biological fluid sample, or a cell sample. A biological fluid includes, as non-limiting examples, blood, plasma, serum, sweat, tears, sputum, urine, ear flow, lymph, saliva, cerebrospinal fluid, lavages, bone marrow suspension, vaginal flow, transcervical lavage, brain fluid, ascites, milk, secretions of the respiratory, intestinal and genitourinary tracts, and leukapheresis samples. In some embodiments, the sample is a sample that is easily obtainable by non-invasive procedures e.g. blood, plasma, serum, sweat, tears, sputum, urine, ear flow, and saliva. In some embodiments, the biological sample is a peripheral blood sample, or the plasma and serum fractions. In other embodiments, the biological sample is a swab or smear, a biopsy specimen, or a cell culture. In another embodiment, the maternal sample is a mixture of two or more biological samples e.g. a biological sample can comprise two or more of a biological fluid sample, a tissue sample, and a cell culture sample. As disclosed above, the terms “blood,” “plasma” and “serum” expressly encompass fractions or processed portions thereof. Similarly, where a sample is taken from a biopsy, swab, smear, etc., the “sample” expressly encompasses a processed fraction or portion derived from the biopsy, swab, smear, etc.

Samples can also be obtained from in vitro cultured tissues, cells, or other polynucleotide-containing sources. The cultured samples can be taken from sources including, but not limited to, cultures (e.g., tissue or cells) maintained in different media and conditions (e.g., pH, pressure, or temperature), cultures (e.g., tissue or cells) maintained for different lengths of time, cultures (e.g., tissue or cells) treated with different factors or reagents (e.g., a drug candidate, or a modulator), or cultures of different types of tissue or cells.

Methods of isolating nucleic acids from biological sources are well known and will differ depending upon the nature of the source. One of skill in the art can readily isolate nucleic acid from a source as needed for the method described herein. In some instances, it can be advantageous to fragment the nucleic acid molecules in the nucleic acid sample. Fragmentation can be random, or it can be specific, as achieved, for example, using restriction endonuclease digestion. Methods for random fragmentation are well known in the art, and include, for example, limited DNase digestion, alkali treatment and physical shearing. In one embodiment, sample nucleic acids are obtained as cfDNA, which is not subjected to fragmentation. In other embodiments, the sample nucleic acids are obtained as genomic DNA, which is subjected to fragmentation into fragments of approximately 500 or more base pairs, and to which NGS methods can be readily applied.

Determination of CNV for Prenatal Diagnoses

Cell-free fetal DNA and RNA circulating in maternal blood can be used for the early non-invasive prenatal diagnosis (NIPD) of an increasing number of genetic conditions, both for pregnancy management and to aid reproductive decision-making. The presence of cell-free DNA circulating in the bloodstream has been known for over 50 years. More recently, the presence of small amounts of circulating fetal DNA was discovered in the maternal bloodstream during pregnancy (Lo et al., Lancet 350:485-487 [1997]). Thought to originate from dying placental cells, cell-free fetal DNA (cfDNA) has been shown to consist of short fragments typically fewer than 200 bp in length (Chan et al., Clin Chem 50:88-92 [2004]), which can be discerned as early as 4 weeks gestation (Illanes et al., Early Human Dev 83:563-566 [2007]), and is known to be cleared from the maternal circulation within hours of delivery (Lo et al., Am J Hum Genet 64:218-224 [1999]). In addition to cfDNA, fragments of cell-free fetal RNA (cfRNA) can also be discerned in the maternal bloodstream, originating from genes that are transcribed in the fetus or placenta. The extraction and subsequent analysis of these fetal genetic elements from a maternal blood sample offers novel opportunities for NIPD.

The present method is a polymorphism-independent method for use in NIPD that does not require that the fetal cfDNA be distinguished from the maternal cfDNA to enable the determination of a fetal aneuploidy. In some embodiments, the aneuploidy is a complete chromosomal trisomy or monosomy, or a partial trisomy or monosomy. Partial aneuploidies are caused by loss or gain of part of a chromosome, and encompass chromosomal imbalances resulting from unbalanced translocations, unbalanced inversions, deletions and insertions. By far the most common known aneuploidy compatible with life is trisomy 21 i.e. Down Syndrome (DS), which is caused by the presence of an extra copy of part or all of chromosome 21. Rarely, DS can be caused by an inherited or sporadic defect whereby an extra copy of all or part of chromosome 21 becomes attached to another chromosome (usually chromosome 14) to form a single aberrant chromosome. DS is associated with intellectual impairment, severe learning difficulties and excess mortality caused by long-term health problems such as heart disease. Other aneuploidies with known clinical significance include Edwards syndrome (trisomy 18) and Patau Syndrome (trisomy 13), which are frequently fatal within the first few months of life. Abnormalities associated with the number of sex chromosomes are also known and include monosomy X e.g. Turner syndrome (XO), and triple X syndrome (XXX) in female births and Klinefelter syndrome (XXY) and XYY syndrome in male births, which are all associated with various phenotypes including sterility and reduction in intellectual skills. The method of the invention can be used to diagnose these and other chromosomal abnormalities prenatally.

According to embodiments of the present invention, the trisomy determined by the present invention is selected from trisomy 21 (T21; Down Syndrome), trisomy 18 (T18; Edwards Syndrome), trisomy 16 (T16), trisomy 22 (T22; Cat Eye Syndrome), trisomy 15 (T15; Prader-Willi Syndrome), trisomy 13 (T13; Patau Syndrome), trisomy 8 (T8; Warkany Syndrome) and the XXY (Klinefelter Syndrome), XYY, or XXX trisomies. It will be appreciated that various other trisomies and partial trisomies can be determined in fetal cfDNA according to the teachings of the present invention. These include, but are not limited to, partial trisomy 1q32-44, trisomy 9 p with trisomy, trisomy 4 mosaicism, trisomy 17p, partial trisomy 4q26-qter, trisomy 9, partial 2p trisomy, partial trisomy 1q, and/or partial trisomy 6p/monosomy 6q.

The method of the present invention can also be used to determine chromosomal monosomy X, and partial monosomies such as monosomy 13, monosomy 15, monosomy 16, monosomy 21, and monosomy 22, which are known to be involved in pregnancy miscarriage. Partial monosomy of chromosomes typically involved in complete aneuploidy can also be determined by the method of the invention. Monosomy 18p is a rare chromosomal disorder in which all or part of the short arm (p) of chromosome 18 is deleted (monosomic). The disorder is typically characterized by short stature, variable degrees of mental retardation, speech delays, malformations of the skull and facial (craniofacial) region, and/or additional physical abnormalities. Associated craniofacial defects may vary greatly in range and severity from case to case. Conditions caused by changes in the structure or number of copies of chromosome 15 include Angelman Syndrome and Prader-Willi Syndrome, which involve a loss of gene activity in the same part of chromosome 15, the 15q11-q13 region. It will be appreciated that several translocations and microdeletions can be asymptomatic in the carrier parent, yet can cause a major genetic disease in the offspring. For example, a healthy mother who carries the 15q11-q13 microdeletion can give birth to a child with Angelman syndrome, a severe neurodegenerative disorder. Thus, the present invention can be used to identify such a deletion in the fetus. Partial monosomy 13q is a rare chromosomal disorder that results when a piece of the long arm (q) of chromosome 13 is missing (monosomic). Infants born with partial monosomy 13q may exhibit low birth weight, malformations of the head and face (craniofacial region), skeletal abnormalities (especially of the hands and feet), and other physical abnormalities. Mental retardation is characteristic of this condition. The mortality rate during infancy is high among individuals born with this disorder.
Almost all cases of partial monosomy 13q occur randomly for no apparent reason (sporadic). 22q11.2 deletion syndrome, also known as DiGeorge syndrome, is a syndrome caused by the deletion of a small piece of chromosome 22. The deletion (22q11.2) occurs near the middle of the chromosome on the long arm of one of the pair of chromosomes. The features of this syndrome vary widely, even among members of the same family, and affect many parts of the body. Characteristic signs and symptoms may include birth defects such as congenital heart disease, defects in the palate, most commonly related to neuromuscular problems with closure (velo-pharyngeal insufficiency), learning disabilities, mild differences in facial features, and recurrent infections. Microdeletions in chromosomal region 22q11.2 are associated with a 20 to 30-fold increased risk of schizophrenia. In one embodiment, the method of the invention is used to determine partial monosomies including but not limited to monosomy 18p, partial monosomy of chromosome 15 (15q11-q13), partial monosomy 13q, and partial monosomy of chromosome 22.

The method of the invention can also be used to determine any aneuploidy if one of the parents is a known carrier of such abnormality. These include, but are not limited to, mosaic for a small supernumerary marker chromosome (SMC); t(11;14)(p15;p13) translocation; unbalanced translocation t(8;11)(p23.2;p15.5); 11q23 microdeletion; Smith-Magenis syndrome 17p11.2 deletion; 22q13.3 deletion; Xp22.3 microdeletion; 10p14 deletion; 20p microdeletion; DiGeorge syndrome [del(22)(q11.2q11.23)]; Williams syndrome (7q11.23 and 7q36 deletions); 1p36 deletion; 2p microdeletion; neurofibromatosis type 1 (17q11.2 microdeletion); Yq deletion; Wolf-Hirschhorn syndrome (WHS, 4p16.3 microdeletion); 1p36.2 microdeletion; 11q14 deletion; 19q13.2 microdeletion; Rubinstein-Taybi (16p13.3 microdeletion); 7p21 microdeletion; Miller-Dieker syndrome (17p13.3); 17p11.2 deletion; and 2q37 microdeletion.

Determination of CNV of Clinical Disorders

In addition to the early determination of birth defects, the methods described herein can be applied to the determination of any abnormality in the representation of genetic sequences within the genome. It has been shown that blood plasma and serum DNA from cancer patients contains measurable quantities of tumor DNA, which can be recovered and used as a surrogate source of tumor DNA. Tumors are characterized by aneuploidy, or inappropriate numbers of gene sequences or even entire chromosomes. The determination of a difference in the amount of a given sequence i.e. a sequence of interest, in a sample from an individual can thus be used in the diagnosis of a medical condition e.g. cancer.

Embodiments of the invention provide for a method to assess copy number variation of a sequence of interest e.g. a clinically-relevant sequence, in a test sample that comprises a mixture of nucleic acids derived from two different genomes, and which are known or are suspected to differ in the amount of one or more sequence of interest. The mixture of nucleic acids is derived from two or more types of cells. In one embodiment, the mixture of nucleic acids is derived from normal and cancerous cells derived from a subject suffering from a medical condition e.g. cancer.

It is believed that many solid tumors, such as breast cancer, progress from initiation to metastasis through the accumulation of several genetic aberrations (Sato et al., Cancer Res 50:7184-7189 [1990]; Jongsma et al., J Clin Pathol: Mol Path 55:305-309 [2002]). Such genetic aberrations, as they accumulate, may confer proliferative advantages, genetic instability and the attendant ability to evolve drug resistance rapidly, and enhanced angiogenesis, proteolysis and metastasis. The genetic aberrations may affect either recessive “tumor suppressor genes” or dominantly acting oncogenes. Deletions and recombination leading to loss of heterozygosity (LOH) are believed to play a major role in tumor progression by uncovering mutated tumor suppressor alleles.

cfDNA has been found in the circulation of patients diagnosed with malignancies including but not limited to lung cancer (Pathak et al. Clin Chem 52:1833-1842 [2006]), prostate cancer (Schwartzenbach et al. Clin Cancer Res 15:1032-8 [2009]), and breast cancer (Schwartzenbach et al. available online at breast-cancer-research.com/content/11/5/R71 [2009]). Identification of genomic instabilities associated with cancers that can be determined in the circulating cfDNA in cancer patients is a potential diagnostic and prognostic tool. In one embodiment, the method of the invention assesses CNV of a sequence of interest in a sample comprising a mixture of nucleic acids derived from a subject that is suspected or is known to have cancer e.g. carcinoma, sarcoma, lymphoma, leukemia, germ cell tumors and blastoma. In one embodiment, the sample is a plasma sample derived (processed) from peripheral blood that comprises a mixture of cfDNA derived from normal and cancerous cells. In another embodiment, the biological sample that is needed to determine whether a CNV is present is derived from a mixture of cancerous and non-cancerous cells from other biological fluids including but not limited to serum, sweat, tears, sputum, urine, ear flow, lymph, saliva, cerebrospinal fluid, lavages, bone marrow suspension, vaginal flow, transcervical lavage, brain fluid, ascites, milk, secretions of the respiratory, intestinal and genitourinary tracts, and leukapheresis samples, or in tissue biopsies, swabs or smears.

The sequence of interest is a nucleic acid sequence that is known or is suspected to play a role in the development and/or progression of the cancer. Examples of a sequence of interest include nucleic acid sequences that are amplified or deleted in cancerous cells as described in the following.

Dominantly acting genes associated with human solid tumors typically exert their effect by overexpression or altered expression. Gene amplification is a common mechanism leading to upregulation of gene expression. Evidence from cytogenetic studies indicates that significant amplification occurs in over 50% of human breast cancers. Most notably, the amplification of the proto-oncogene human epidermal growth factor receptor 2 (HER2) located on chromosome 17 (17q21-q22) results in overexpression of HER2 receptors on the cell surface, leading to excessive and dysregulated signaling in breast cancer and other malignancies (Park et al., Clinical Breast Cancer 8:392-401 [2008]). A variety of oncogenes have been found to be amplified in other human malignancies. Examples of the amplification of cellular oncogenes in human tumors include amplifications of: c-myc in promyelocytic leukemia cell line HL60, and in small-cell lung carcinoma cell lines; N-myc in primary neuroblastomas (stages III and IV), neuroblastoma cell lines, retinoblastoma cell line and primary tumors, and small-cell lung carcinoma lines and tumors; L-myc in small-cell lung carcinoma cell lines and tumors; c-myb in acute myeloid leukemia and in colon carcinoma cell lines; c-erbB in epidermoid carcinoma cells and primary gliomas; c-K-ras-2 in primary carcinomas of lung, colon, bladder, and rectum; and N-ras in mammary carcinoma cell line (Varmus H., Ann Rev Genetics 18:553-612 [1984], cited in Watson et al., Molecular Biology of the Gene (4th ed.; Benjamin/Cummings Publishing Co. 1987)).

Chromosomal deletions involving tumor suppressor genes may play an important role in the development and progression of solid tumors. The retinoblastoma tumor suppressor gene (Rb-1), located in chromosome 13q14, is the most extensively characterized tumor suppressor gene. The Rb-1 gene product, a 105 kDa nuclear phosphoprotein, apparently plays an important role in cell cycle regulation (Howe et al., Proc Natl Acad Sci (USA) 87:5883-5887 [1990]). Altered or lost expression of the Rb protein is caused by inactivation of both gene alleles either through a point mutation or a chromosomal deletion. Rb-1 gene alterations have been found to be present not only in retinoblastomas but also in other malignancies such as osteosarcomas, small cell lung cancer (Rygaard et al., Cancer Res 50:5312-5317 [1990]) and breast cancer. Restriction fragment length polymorphism (RFLP) studies have indicated that such tumor types have frequently lost heterozygosity at 13q, suggesting that one of the Rb-1 gene alleles has been lost due to a gross chromosomal deletion (Bowcock et al., Am J Hum Genet 46:12 [1990]). Chromosome 1 abnormalities including duplications, deletions and unbalanced translocations involving chromosome 6 and other partner chromosomes indicate that regions of chromosome 1, in particular 1q21-1q32 and 1p11-13, might harbor oncogenes or tumor suppressor genes that are pathogenetically relevant to both chronic and advanced phases of myeloproliferative neoplasms (Caramazza et al., Eur J Hematol 84:191-200 [2010]). Myeloproliferative neoplasms are also associated with deletions of chromosome 5. Complete loss or interstitial deletions of chromosome 5 are the most common karyotypic abnormality in myelodysplastic syndromes (MDSs). Isolated del(5q)/5q-MDS patients have a more favorable prognosis than those with additional karyotypic defects, who tend to develop myeloproliferative neoplasms (MPNs) and acute myeloid leukemia.
The frequency of unbalanced chromosome 5 deletions has led to the idea that 5q harbors one or more tumor-suppressor genes that have fundamental roles in the growth control of hematopoietic stem/progenitor cells (HSCs/HPCs). Cytogenetic mapping of commonly deleted regions (CDRs) centered on 5q31 and 5q32 identified candidate tumor-suppressor genes, including the ribosomal subunit RPS14, the transcription factor Egr1/Krox20 and the cytoskeletal remodeling protein, alpha-catenin (Eisenmann et al., Oncogene 28:3429-3441 [2009]). Cytogenetic and allelotyping studies of fresh tumours and tumour cell lines have shown that allelic losses from several distinct regions on chromosome 3p, including 3p25, 3p21-22, 3p21.3, 3p12-13 and 3p14, are the earliest and most frequent genomic abnormalities involved in a wide spectrum of major epithelial cancers of lung, breast, kidney, head and neck, ovary, cervix, colon, pancreas, esophagus, bladder and other organs. Several tumor suppressor genes have been mapped to the chromosome 3p region, and it is thought that interstitial deletions or promoter hypermethylation precede the loss of 3p or the entire chromosome 3 in the development of carcinomas (Angeloni D., Briefings Functional Genomics 6:19-39 [2007]).

Newborns and children with Down syndrome (DS) often present with congenital transient leukemia and have an increased risk of acute myeloid leukemia and acute lymphoblastic leukemia. Chromosome 21, harboring about 300 genes, may be involved in numerous structural aberrations, e.g., translocations, deletions, and amplifications, in leukemias, lymphomas, and solid tumors. Moreover, genes located on chromosome 21 have been identified that play an important role in tumorigenesis. Somatic numerical as well as structural chromosome 21 aberrations are associated with leukemias, and specific genes including RUNX1, TMPRSS2, and TFF, which are located in 21q, play a role in tumorigenesis (Fonatsch C., Genes Chromosomes Cancer 49:497-508 [2010]).

In one embodiment, the method provides a means to assess the association between gene amplification and the extent of tumor evolution. Correlation between amplification and/or deletion and stage or grade of a cancer may be prognostically important because such information may contribute to the definition of a genetically based tumor grade that would better predict the future course of disease, with more advanced tumors having the worst prognosis. In addition, information about early amplification and/or deletion events may be useful in associating those events as predictors of subsequent disease progression. Gene amplification and deletions as identified by the method can be associated with other known parameters such as tumor grade, histology, BrdUrd labeling index, hormonal status, nodal involvement, tumor size, survival duration and other tumor properties available from epidemiological and biostatistical studies. For example, tumor DNA to be tested by the method could include atypical hyperplasia, ductal carcinoma in situ, stage I-III cancer and metastatic lymph nodes in order to permit the identification of associations between amplifications and deletions and stage. The associations made may make effective therapeutic intervention possible. For example, consistently amplified regions may contain an overexpressed gene, the product of which may be able to be attacked therapeutically (for example, the growth factor receptor tyrosine kinase, p185^(HER2)).

The method can be used to identify amplification and/or deletion events that are associated with drug resistance by comparing the copy number variation of nucleic acids from primary cancers with that of cells that have metastasized to other sites. If gene amplification and/or deletion is a manifestation of karyotypic instability that allows rapid development of drug resistance, more amplification and/or deletion in primary tumors from chemoresistant patients than in tumors in chemosensitive patients would be expected. For example, if amplification of specific genes is responsible for the development of drug resistance, regions surrounding those genes would be expected to be amplified consistently in tumor cells from pleural effusions of chemoresistant patients but not in the primary tumors. Discovery of associations between gene amplification and/or deletion and the development of drug resistance may allow the identification of patients that will or will not benefit from adjuvant therapy.

Apparatus and Systems for Determining CNV

Analysis of the sequencing data and the diagnosis derived therefrom are typically performed using various computer algorithms and programs. In one embodiment, the invention provides a computer program product for generating an output indicating the presence or absence of a fetal aneuploidy in a test sample. The computer product comprises a computer readable medium having a computer executable logic recorded thereon for enabling a processor to diagnose a fetal aneuploidy comprising: a receiving procedure for receiving sequencing data from at least a portion of nucleic acid molecules from a maternal biological sample, wherein said sequencing data comprises a calculated chromosome dose; computer assisted logic for analyzing a fetal aneuploidy from said received data; and an output procedure for generating an output indicating the presence, absence or kind of said fetal aneuploidy. The method of the invention can be performed using a computer-readable medium having stored thereon computer-readable instructions for carrying out a method for identifying any CNV e.g. chromosomal or partial aneuploidies. In one embodiment, the invention provides a computer-readable medium having stored thereon computer-readable instructions for identifying a chromosome suspected to be involved with a chromosomal aneuploidy e.g. trisomy 21, trisomy 13, trisomy 18, or monosomy X.

In one embodiment, the invention provides a computer-readable medium having stored thereon computer-readable instructions for carrying out a method for identifying fetal trisomy 21, said method comprising the steps: (a) obtaining sequence information for a plurality of fetal and maternal nucleic acid molecules of a maternal plasma sample; (b) using the sequence information to identify a number of mapped sequence tags for chromosome 21; (c) using the sequence information to identify a number of mapped sequence tags for at least one normalizing chromosome; (d) using the number of mapped sequence tags identified for chromosome 21 in step (b) and the number of mapped sequence tags identified for the at least one normalizing chromosome in step (c) to calculate a chromosome dose for chromosome 21; and (e) comparing said chromosome dose to at least one threshold value, and thereby identifying the presence or absence of fetal trisomy 21. In one embodiment, step (d) comprises calculating a chromosome dose for chromosome 21 as the ratio of the number of mapped sequence tags identified for chromosome 21 and the number of mapped sequence tags identified for the at least one normalizing chromosome. Alternatively, step (d) comprises (i) calculating a sequence tag density ratio for chromosome 21, by relating the number of mapped sequence tags identified for chromosome 21 in step (b) to the length of chromosome 21; (ii) calculating a sequence tag density ratio for said at least one normalizing chromosome, by relating the number of mapped sequence tags identified for said at least one normalizing chromosome in step (c) to the length of said at least one normalizing chromosome; and (iii) using the sequence tag density ratios calculated in steps (i) and (ii) to calculate a chromosome dose for chromosome 21, wherein the chromosome dose is calculated as the ratio of the sequence tag density ratio for chromosome 21 and the sequence tag density ratio for said at least one normalizing chromosome. 
In one embodiment, the atleast one normalizing chromosome is selected from the group ofchromosome 9, chromosome 1, chromosome 2, chromosome 3, chromosome 4,chromosome 5, chromosome 6, chromosome 7, chromosome 8, chromosome 10,chromosome 11, chromosome 12, chromosome 13, chromosome 14, chromosome15, chromosome 16, and chromosome 17. Preferably, the at least onenormalizing chromosome is selected from the group of chromosome 9,chromosome 1, chromosome 2, chromosome 11, chromosome 12, and chromosome14. In one embodiment, the fetal and maternal nucleic acid molecules arecell-free DNA molecules. In some embodiments, the sequencing method foridentifying the fetal trisomy 21 is a next generation sequencing method.In some embodiments, the sequencing method is a massively parallelsequencing method that uses sequencing-by-synthesis,sequencing-by-ligation, or pyrosequencing. Preferably, the sequencingmethod is massively parallel sequencing-by-synthesis using reversibledye terminators. In other embodiments, the sequencing method is Sangersequencing. In some embodiments, the sequencing method comprises anamplification e.g. a PCR amplification. 
In some embodiments, thecomputer-readable medium having stored thereon computer-readableinstructions for identifying fetal trisomy 21 carries out a methodcomprising the steps of (a) using sequence information obtained from aplurality of fetal and maternal nucleic acid molecules in a maternalplasma sample to identify a number of mapped sequence tags forchromosome 21; (b) using sequence information obtained from a pluralityof fetal and maternal nucleic acid molecules in a maternal plasma sampleto identify a number of mapped sequence tags for at least onenormalizing chromosome; (c) using the number of mapped sequence tagsidentified for chromosome 21 in step (a) and the number of mappedsequence tags identified for the at least one normalizing chromosome instep (b) to calculate a chromosome dose for chromosome 21; and (d)comparing said chromosome dose to at least one threshold value, andthereby identifying the presence or absence of fetal trisomy 21.
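
The count-ratio form of the chromosome dose described above can be sketched in a few lines of Python. This is a minimal illustration, not the implementation of the invention: the function names, the `chr21`/`chr9` labels, the tag counts and the threshold value are all hypothetical, and two points are assumptions not stated in the text, namely that tags from a group of normalizing chromosomes are summed, and that a dose above the threshold indicates a trisomy.

```python
from typing import Mapping, Sequence


def chromosome_dose(tag_counts: Mapping[str, int],
                    target: str,
                    normalizers: Sequence[str]) -> float:
    """Ratio of mapped tags on the target chromosome to mapped tags
    on the normalizing chromosome(s).

    Assumption: when several normalizing chromosomes are given,
    their tag counts are summed to form the denominator.
    """
    denominator = sum(tag_counts[c] for c in normalizers)
    if denominator == 0:
        raise ValueError("no mapped tags on the normalizing chromosome(s)")
    return tag_counts[target] / denominator


def dose_exceeds_threshold(dose: float, threshold: float) -> bool:
    """Assumption: a dose above the threshold is read as a gain
    (e.g. a trisomy) of the target chromosome."""
    return dose > threshold


# Hypothetical counts and threshold, for illustration only.
example_counts = {"chr21": 1500, "chr9": 12000}
example_dose = chromosome_dose(example_counts, "chr21", ["chr9"])
example_call = dose_exceeds_threshold(example_dose, threshold=0.12)
```

In practice the threshold would be derived from qualified samples as described elsewhere in the specification; the value used here is a placeholder.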

In one embodiment, the invention provides a computer-readable medium having stored thereon computer-readable instructions for carrying out a method for identifying fetal trisomy 21 in a maternal plasma sample comprising fetal and maternal nucleic acid molecules, and comprises the steps: (a) sequencing at least a portion of said nucleic acid molecules, thereby obtaining sequence information for a plurality of fetal and maternal nucleic acid molecules of a maternal plasma sample; (b) using the sequence information to identify a number of mapped sequence tags for chromosome 21; (c) using the sequence information to identify a number of mapped sequence tags for at least one normalizing chromosome; (d) using the number of mapped sequence tags identified for chromosome 21 in step (b) and the number of mapped sequence tags identified for the at least one normalizing chromosome in step (c) to calculate a chromosome dose for chromosome 21; and (e) comparing said chromosome dose to at least one threshold value, and thereby identifying the presence or absence of fetal trisomy 21. In one embodiment, step (d) comprises calculating a chromosome dose for chromosome 21 as the ratio of the number of mapped sequence tags identified for chromosome 21 and the number of mapped sequence tags identified for the at least one normalizing chromosome. 
Alternatively, step (d) comprises (i) calculating a sequence tag density ratio for chromosome 21, by relating the number of mapped sequence tags identified for chromosome 21 in step (b) to the length of chromosome 21; (ii) calculating a sequence tag density ratio for said at least one normalizing chromosome, by relating the number of mapped sequence tags identified for said at least one normalizing chromosome in step (c) to the length of said at least one normalizing chromosome; and (iii) using the sequence tag density ratios calculated in steps (i) and (ii) to calculate a chromosome dose for chromosome 21, wherein the chromosome dose is calculated as the ratio of the sequence tag density ratio for chromosome 21 and the sequence tag density ratio for said at least one normalizing chromosome. In one embodiment, the at least one normalizing chromosome is selected from the group of chromosome 9, chromosome 1, chromosome 2, chromosome 3, chromosome 4, chromosome 5, chromosome 6, chromosome 7, chromosome 8, chromosome 10, chromosome 11, chromosome 12, chromosome 13, chromosome 14, chromosome 15, chromosome 16, and chromosome 17. Preferably, the at least one normalizing chromosome is selected from the group of chromosome 9, chromosome 1, chromosome 2, chromosome 11, chromosome 12, and chromosome 14. In one embodiment, the fetal and maternal nucleic acid molecules are cell-free DNA molecules. In some embodiments, the sequencing method for identifying the fetal trisomy 21 is a next generation sequencing method. In some embodiments, the sequencing method is a massively parallel sequencing method that uses sequencing-by-synthesis with reversible dye terminators. In other embodiments, the sequencing method is sequencing-by-ligation. In some embodiments, sequencing comprises an amplification. 
In one embodiment,the computer-readable medium having stored thereon computer-readableinstructions for identifying fetal trisomy 21 carries out a methodcomprising the steps of (a) using sequence information obtained from aplurality of fetal and maternal nucleic acid molecules in a maternalplasma sample to identify a number of mapped sequence tags forchromosome 21; (b) using sequence information obtained from a pluralityof fetal and maternal nucleic acid molecules in a maternal plasma sampleto identify a number of mapped sequence tags for at least onenormalizing chromosome; (c) using the number of mapped sequence tagsidentified for chromosome 21 in step (a) and the number of mappedsequence tags identified for the at least one normalizing chromosome instep (b) to calculate a chromosome dose for chromosome 21; and (d)comparing said chromosome dose to at least one threshold value, andthereby identifying the presence or absence of fetal trisomy 21. Thecomputer-readable medium can be used for identifying other fetaltrisomies e.g. trisomy 13, trisomy 18, trisomy 21, and chromosomalmonosomies e.g. monosomy X.
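
The alternative, length-normalized form of step (d) can be sketched as follows. The sketch assumes that "relating the number of mapped sequence tags to the length" means dividing the tag count by the chromosome length, and that for a group of normalizing chromosomes the pooled count is divided by the pooled length; neither detail is fixed by the text. The function names, labels, counts and lengths are hypothetical round numbers chosen for illustration, not reference genome values.

```python
from typing import Mapping, Sequence


def tag_density(tag_counts: Mapping[str, int],
                lengths: Mapping[str, int],
                chromosomes: Sequence[str]) -> float:
    """Mapped tags per base over one or more chromosomes.

    Assumption: for a group, pooled tags are divided by pooled length.
    """
    total_length = sum(lengths[c] for c in chromosomes)
    if total_length <= 0:
        raise ValueError("chromosome length must be positive")
    return sum(tag_counts[c] for c in chromosomes) / total_length


def density_based_dose(tag_counts: Mapping[str, int],
                       lengths: Mapping[str, int],
                       target: str,
                       normalizers: Sequence[str]) -> float:
    """Chromosome dose as the ratio of the target's tag density to the
    tag density of the normalizing chromosome(s)."""
    normalizer_density = tag_density(tag_counts, lengths, normalizers)
    if normalizer_density == 0:
        raise ValueError("no mapped tags on the normalizing chromosome(s)")
    return tag_density(tag_counts, lengths, [target]) / normalizer_density


# Hypothetical counts and lengths: equal densities give a dose near 1.
demo_counts = {"chr21": 1000, "chr9": 3500}
demo_lengths = {"chr21": 40_000_000, "chr9": 140_000_000}
demo_dose = density_based_dose(demo_counts, demo_lengths, "chr21", ["chr9"])
```

Because both densities are divided by length, a dose computed this way is comparable across target chromosomes of different sizes, whereas the plain count ratio is not.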

In another embodiment, a computer-readable medium having stored thereon computer-readable instructions is provided for carrying out a method for identifying fetal trisomy 18 in a maternal plasma sample comprising fetal and maternal nucleic acid molecules, according to the method described for trisomy 21 wherein the normalizing chromosome for identifying trisomy 18 is selected from chromosome 8, chromosome 2, chromosome 3, chromosome 4, chromosome 5, chromosome 6, chromosome 7, chromosome 9, chromosome 10, chromosome 11, chromosome 12, chromosome 13, and chromosome 14. Preferably, the normalizing chromosome for identifying trisomy 18 is selected from chromosome 8, chromosome 2, chromosome 3, chromosome 5, chromosome 6, chromosome 12, and chromosome 14.

In another embodiment, a computer-readable medium having stored thereon computer-readable instructions is provided for carrying out a method for identifying fetal trisomy 13 in a maternal plasma sample comprising fetal and maternal nucleic acid molecules, according to the method described for trisomy 21 wherein the normalizing chromosome for identifying trisomy 13 is selected from chromosome 2, chromosome 3, chromosome 4, chromosome 5, chromosome 6, chromosome 7, chromosome 8, chromosome 9, chromosome 10, chromosome 11, chromosome 12, chromosome 14, chromosome 18, and chromosome 21. In some embodiments, the at least one normalizing chromosome is a group of chromosomes selected from chromosome 2, chromosome 3, chromosome 4, chromosome 5, chromosome 6, and chromosome 8. Preferably, the normalizing chromosome for identifying trisomy 13 is a group of chromosomes selected from chromosome 2, chromosome 3, chromosome 4, chromosome 5 and chromosome 6.

In another embodiment, a computer-readable medium having stored thereon computer-readable instructions is provided for carrying out a method for identifying fetal monosomy X in a maternal plasma sample comprising fetal and maternal nucleic acid molecules, according to the method described for trisomy 21 wherein the normalizing chromosome for identifying monosomy X is selected from chromosome 1, chromosome 2, chromosome 3, chromosome 4, chromosome 5, chromosome 6, chromosome 7, chromosome 8, chromosome 9, chromosome 10, chromosome 11, chromosome 12, chromosome 13, chromosome 14, chromosome 15, and chromosome 16. Preferably, the normalizing chromosome is selected from chromosome 2, chromosome 3, chromosome 4, chromosome 5, chromosome 6, and chromosome 8. Alternatively, the normalizing chromosome is a group of chromosomes selected from chromosome 2, chromosome 3, chromosome 4, chromosome 5, chromosome 6, and chromosome 8. In one embodiment, the method for identifying fetal monosomy X further comprises determining the presence or absence of chromosome Y, comprising the steps: (a) using the sequence information to identify a number of mapped sequence tags for chromosome Y; (b) using the sequence information to identify a number of mapped sequence tags for at least one normalizing chromosome; (c) using the number of mapped sequence tags identified for chromosome Y in step (a) and the number of mapped sequence tags identified for the at least one normalizing chromosome in step (b) to calculate a chromosome dose for chromosome Y; and (d) comparing said chromosome dose to at least one threshold value, and thereby identifying the presence or absence of fetal chromosome Y. 
In one embodiment, obtaining the sequenceinformation comprises sequencing at least a portion of said nucleic acidmolecules, thereby obtaining sequence information for a plurality offetal and maternal nucleic acid molecules of a maternal plasma sample.In one embodiment, step (c) comprises calculating a chromosome dose forchromosome Y as the ratio of the number of mapped sequence tagsidentified for chromosome Y and the number of mapped sequence tagsidentified for the at least one normalizing chromosome. Alternatively,step (c) comprises (i) calculating a sequence tag density ratio forchromosome Y, by relating the number of mapped sequence tags identifiedfor chromosome Y in step (a) to the length of chromosome Y; (ii)calculating a sequence tag density ratio for said at least onenormalizing chromosome, by relating the number of mapped sequence tagsidentified for said at least one normalizing chromosome in step (b) tothe length of said at least one normalizing chromosome; and (iii) usingthe sequence tag density ratios calculated in steps (i) and (ii) tocalculate a chromosome dose for chromosome Y, wherein the chromosomedose is calculated as the ratio of the sequence tag density ratio forchromosome Y and the sequence tag density ratio for said at least onenormalizing chromosome. Any one chromosome, or a group of two or morechromosomes selected from chromosomes 1-22 and chromosome X can be usedas the normalizing chromosome for chromosome Y. In one embodiment, theat least one normalizing chromosome is a group of chromosomes consistingof chromosomes 1-22, and chromosome X.
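
The chromosome Y determination described above, using chromosomes 1-22 and chromosome X as the normalizing group, might be sketched as below. The names `Y_NORMALIZERS`, `chromosome_y_dose` and `chromosome_y_present`, the `chr`-prefixed labels, the tag counts and the threshold are hypothetical. Summing tags across the normalizing group, and treating a dose above the threshold as presence of fetal chromosome Y, are assumptions made for the sketch.

```python
from typing import Mapping, Sequence

# Normalizing group named in the text: chromosomes 1-22 and chromosome X.
Y_NORMALIZERS = [f"chr{i}" for i in range(1, 23)] + ["chrX"]


def chromosome_y_dose(tag_counts: Mapping[str, int],
                      normalizers: Sequence[str] = Y_NORMALIZERS) -> float:
    """Ratio of mapped chromosome Y tags to mapped tags summed over
    the normalizing group (summation is an assumption)."""
    denominator = sum(tag_counts.get(c, 0) for c in normalizers)
    if denominator == 0:
        raise ValueError("no mapped tags on the normalizing chromosomes")
    return tag_counts.get("chrY", 0) / denominator


def chromosome_y_present(tag_counts: Mapping[str, int],
                         threshold: float) -> bool:
    """Assumption: a dose above the threshold indicates that fetal
    chromosome Y is present."""
    return chromosome_y_dose(tag_counts) > threshold


# Hypothetical sample: 1000 tags on each normalizing chromosome, 23 on Y.
y_counts = {c: 1000 for c in Y_NORMALIZERS}
y_counts["chrY"] = 23
y_dose = chromosome_y_dose(y_counts)
y_call = chromosome_y_present(y_counts, threshold=0.0005)
```

A reduced chromosome X dose together with an absent chromosome Y call is the combination the text associates with monosomy X; how the two results are combined is left to the thresholds chosen by the practitioner.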

The method of the invention can be performed using a computer processingsystem which is adapted or configured to perform a method foridentifying any CNV e.g. chromosomal or partial aneuploidies. In oneembodiment, the invention provides a computer processing system which isadapted or configured to perform a method for identifying fetal trisomy21, said method comprising the steps: (a) obtaining sequence informationfor a plurality of fetal and maternal nucleic acid molecules of amaternal plasma sample; (b) using the sequence information to identify anumber of mapped sequence tags for chromosome 21; (c) using the sequenceinformation to identify a number of mapped sequence tags for at leastone normalizing chromosome; (d) using the number of mapped sequence tagsidentified for chromosome 21 in step (b) and the number of mappedsequence tags identified for the at least one normalizing chromosome instep (c) to calculate a chromosome dose for chromosome 21; and (e)comparing said chromosome dose to at least one threshold value, andthereby identifying the presence or absence of fetal trisomy 21. In oneembodiment, step (d) comprises calculating a chromosome dose forchromosome 21 as the ratio of the number of mapped sequence tagsidentified for chromosome 21 and the number of mapped sequence tagsidentified for the at least one normalizing chromosome. 
Alternatively, step (d) comprises (i) calculating a sequence tag density ratio for chromosome 21, by relating the number of mapped sequence tags identified for chromosome 21 in step (b) to the length of chromosome 21; (ii) calculating a sequence tag density ratio for said at least one normalizing chromosome, by relating the number of mapped sequence tags identified for said at least one normalizing chromosome in step (c) to the length of said at least one normalizing chromosome; and (iii) using the sequence tag density ratios calculated in steps (i) and (ii) to calculate a chromosome dose for chromosome 21, wherein the chromosome dose is calculated as the ratio of the sequence tag density ratio for chromosome 21 and the sequence tag density ratio for said at least one normalizing chromosome. In one embodiment, the at least one normalizing chromosome is selected from the group consisting of chromosome 9, chromosome 1, chromosome 2, chromosome 3, chromosome 4, chromosome 5, chromosome 6, chromosome 7, chromosome 8, chromosome 10, chromosome 11, chromosome 12, chromosome 13, chromosome 14, chromosome 15, chromosome 16, and chromosome 17. In one embodiment, the fetal and maternal nucleic acid molecules are cell-free DNA molecules. In some embodiments, the sequencing method for identifying the fetal trisomy 21 is a next generation sequencing method. In some embodiments, the sequencing method is a massively parallel sequencing method that uses sequencing-by-synthesis with reversible dye terminators. In other embodiments, the sequencing method is sequencing-by-ligation. In some embodiments, sequencing comprises an amplification. 
In one embodiment, a computer processing system is adapted or configured for carrying out a method comprising the steps of: (a) using sequence information obtained from a plurality of fetal and maternal nucleic acid molecules in a maternal plasma sample to identify a number of mapped sequence tags for chromosome 21; (b) using sequence information obtained from a plurality of fetal and maternal nucleic acid molecules in a maternal plasma sample to identify a number of mapped sequence tags for at least one normalizing chromosome; (c) using the number of mapped sequence tags identified for chromosome 21 in step (a) and the number of mapped sequence tags identified for the at least one normalizing chromosome in step (b) to calculate a chromosome dose for chromosome 21; and (d) comparing said chromosome dose to at least one threshold value, and thereby identifying the presence or absence of fetal trisomy 21.
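
Steps (a) through (d) as carried out by a processing system can be sketched end to end, starting from the tags themselves. The sketch assumes the sequence information has already been aligned and reduced to one chromosome label per mapped tag; that input format, the function names, the labels, the default normalizer and the threshold are hypothetical. Alignment itself is outside the sketch.

```python
from collections import Counter
from typing import Iterable, Sequence, Tuple


def count_mapped_tags(mapped_chromosomes: Iterable[str]) -> Counter:
    """Steps (a) and (b): count mapped sequence tags per chromosome.

    Assumption: each element is the chromosome a tag mapped to.
    """
    return Counter(mapped_chromosomes)


def dose_and_call(mapped_chromosomes: Iterable[str],
                  target: str = "chr21",
                  normalizers: Sequence[str] = ("chr9",),
                  threshold: float = 0.12) -> Tuple[float, bool]:
    """Steps (c) and (d): compute the dose and compare it to a threshold.

    Assumptions: normalizer counts are summed, and a dose above the
    (placeholder) threshold is reported as a positive call.
    """
    counts = count_mapped_tags(mapped_chromosomes)
    denominator = sum(counts[c] for c in normalizers)
    if denominator == 0:
        raise ValueError("no mapped tags on the normalizing chromosome(s)")
    dose = counts[target] / denominator
    return dose, dose > threshold


# Hypothetical input: 3 tags on chr21, 20 on chr9, 40 on chr1.
demo_tags = ["chr21"] * 3 + ["chr9"] * 20 + ["chr1"] * 40
demo_result = dose_and_call(demo_tags)
```

Tags that map to chromosomes other than the target and the normalizer are counted but do not enter the dose, which matches the ratio described in step (c).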

In one embodiment, the invention provides a computer processing system that is adapted or configured to perform a method for identifying fetal trisomy 21 in a maternal plasma sample comprising fetal and maternal nucleic acid molecules, and comprises the steps: (a) sequencing at least a portion of said nucleic acid molecules, thereby obtaining sequence information for a plurality of fetal and maternal nucleic acid molecules of a maternal plasma sample; (b) using the sequence information to identify a number of mapped sequence tags for chromosome 21; (c) using the sequence information to identify a number of mapped sequence tags for at least one normalizing chromosome; (d) using the number of mapped sequence tags identified for chromosome 21 in step (b) and the number of mapped sequence tags identified for the at least one normalizing chromosome in step (c) to calculate a chromosome dose for chromosome 21; and (e) comparing said chromosome dose to at least one threshold value, and thereby identifying the presence or absence of fetal trisomy 21. In one embodiment, step (d) comprises calculating a chromosome dose for chromosome 21 as the ratio of the number of mapped sequence tags identified for chromosome 21 and the number of mapped sequence tags identified for the at least one normalizing chromosome. 
Alternatively, step (d) comprises (i) calculating a sequence tag density ratio for chromosome 21, by relating the number of mapped sequence tags identified for chromosome 21 in step (b) to the length of chromosome 21; (ii) calculating a sequence tag density ratio for said at least one normalizing chromosome, by relating the number of mapped sequence tags identified for said at least one normalizing chromosome in step (c) to the length of said at least one normalizing chromosome; and (iii) using the sequence tag density ratios calculated in steps (i) and (ii) to calculate a chromosome dose for chromosome 21, wherein the chromosome dose is calculated as the ratio of the sequence tag density ratio for chromosome 21 and the sequence tag density ratio for said at least one normalizing chromosome. In one embodiment, the at least one normalizing chromosome is selected from the group consisting of chromosome 9, chromosome 1, chromosome 2, chromosome 3, chromosome 4, chromosome 5, chromosome 6, chromosome 7, chromosome 8, chromosome 10, chromosome 11, chromosome 12, chromosome 13, chromosome 14, chromosome 15, chromosome 16, and chromosome 17. In one embodiment, the fetal and maternal nucleic acid molecules are cell-free DNA molecules. In some embodiments, the sequencing method for identifying the fetal trisomy 21 is a next generation sequencing method. In some embodiments, the sequencing method is a massively parallel sequencing method that uses sequencing-by-synthesis with reversible dye terminators. In other embodiments, the sequencing method is sequencing-by-ligation. In some embodiments, sequencing comprises an amplification. 
In one embodiment, acomputer processing system is adapted or configured for carrying out amethod comprising the steps of: (a) using sequence information obtainedfrom a plurality of fetal and maternal nucleic acid molecules in amaternal plasma sample to identify a number of mapped sequence tags forchromosome 21; (b) using sequence information obtained from a pluralityof fetal and maternal nucleic acid molecules in a maternal plasma sampleto identify a number of mapped sequence tags for at least onenormalizing chromosome; (c) using the number of mapped sequence tagsidentified for chromosome 21 in step (a) and the number of mappedsequence tags identified for the at least one normalizing chromosome instep (b) to calculate a chromosome dose for chromosome 21; and (d)comparing said chromosome dose to at least one threshold value, andthereby identifying the presence or absence of fetal trisomy 21.

In another embodiment, the computer processing system is adapted or configured for identifying fetal trisomy 18 in a maternal plasma sample comprising fetal and maternal nucleic acid molecules, according to the method described for trisomy 21 wherein the normalizing chromosome for identifying trisomy 18 is selected from chromosome 8, chromosome 2, chromosome 3, chromosome 4, chromosome 5, chromosome 6, chromosome 7, chromosome 9, chromosome 10, chromosome 11, chromosome 12, chromosome 13, and chromosome 14.

In another embodiment, the computer processing system is adapted or configured for identifying fetal trisomy 13 in a maternal plasma sample comprising fetal and maternal nucleic acid molecules, according to the method described for trisomy 21 wherein the normalizing chromosome for identifying trisomy 13 is selected from chromosome 2, chromosome 3, chromosome 4, chromosome 5, chromosome 6, chromosome 7, chromosome 8, chromosome 9, chromosome 10, chromosome 11, chromosome 12, chromosome 14, chromosome 18, and chromosome 21. Preferably, the normalizing chromosome for identifying trisomy 13 is a combination of a group of chromosomes consisting of chromosome 2, chromosome 3, chromosome 4, chromosome 5 and chromosome 6.

In another embodiment, the computer processing system is adapted orconfigured for identifying fetal monosomy X in a maternal plasma samplecomprising fetal and maternal nucleic acid molecules, according to themethod described for trisomy 21 wherein the normalizing chromosome foridentifying monosomy X is selected from chromosome 1, chromosome 2,chromosome 3, chromosome 4, chromosome 5, chromosome 6, chromosome 7,chromosome 8, chromosome 9, chromosome 10, chromosome 11, chromosome 12,chromosome 13, chromosome 14, chromosome 15, and chromosome 16. In oneembodiment, the method for identifying fetal monosomy X furthercomprises determining the presence or absence of chromosome Y,comprising the steps: (a) using the sequence information to identify anumber of mapped sequence tags for chromosome Y; (b) using the sequenceinformation to identify a number of mapped sequence tags for at leastone normalizing chromosome; (c) using the number of mapped sequence tagsidentified for chromosome Y in step (a) and the number of mappedsequence tags identified for the at least one normalizing chromosome instep (b) to calculate a chromosome dose for chromosome Y; and (d)comparing said chromosome dose to at least one threshold value, andthereby identifying the presence or absence of fetal chromosome Y. Inone embodiment, obtaining the sequence information comprises sequencingat least a portion of said nucleic acid molecules, thereby obtainingsequence information for a plurality of fetal and maternal nucleic acidmolecules of a maternal plasma sample. In one embodiment, step (c)comprises calculating a chromosome dose for chromosome Y as the ratio ofthe number of mapped sequence tags identified for chromosome Y and thenumber of mapped sequence tags identified for the at least onenormalizing chromosome. 
Alternatively, step (c) comprises (i)calculating a sequence tag density ratio for chromosome Y, by relatingthe number of mapped sequence tags identified for chromosome Y in step(a) to the length of chromosome Y; (ii) calculating a sequence tagdensity ratio for said at least one normalizing chromosome, by relatingthe number of mapped sequence tags identified for said at least onenormalizing chromosome in step (b) to the length of said at least onenormalizing chromosome; and (iii) using the sequence tag density ratioscalculated in steps (i) and (ii) to calculate a chromosome dose forchromosome Y, wherein the chromosome dose is calculated as the ratio ofthe sequence tag density ratio for chromosome Y and the sequence tagdensity ratio for said at least one normalizing chromosome. Any onechromosome, or a group of two or more chromosomes selected fromchromosomes 1-22 and chromosome X can be used as the normalizingchromosome for chromosome Y. In one embodiment, the at least onenormalizing chromosome is a group of chromosomes consisting ofchromosomes 1-22, and chromosome X.

In one embodiment, the invention provides an apparatus that is adapted and configured to perform a method of identifying a CNV e.g. a chromosomal or a partial aneuploidy, as described herein. In one embodiment, the apparatus is configured to perform a method for identifying fetal trisomy 21 comprising: (a) obtaining sequence information for a plurality of fetal and maternal nucleic acid molecules of a maternal plasma sample; (b) using the sequence information to identify a number of mapped sequence tags for chromosome 21; (c) using the sequence information to identify a number of mapped sequence tags for at least one normalizing chromosome; (d) using the number of mapped sequence tags identified for chromosome 21 in step (b) and the number of mapped sequence tags identified for the at least one normalizing chromosome in step (c) to calculate a chromosome dose for chromosome 21; and (e) comparing said chromosome dose to at least one threshold value, and thereby identifying the presence or absence of fetal trisomy 21. In one embodiment, step (d) comprises calculating a chromosome dose for chromosome 21 as the ratio of the number of mapped sequence tags identified for chromosome 21 and the number of mapped sequence tags identified for the at least one normalizing chromosome. 
Alternatively, step (d) comprises (i) calculating a sequence tag density ratio for chromosome 21, by relating the number of mapped sequence tags identified for chromosome 21 in step (b) to the length of chromosome 21; (ii) calculating a sequence tag density ratio for said at least one normalizing chromosome, by relating the number of mapped sequence tags identified for said at least one normalizing chromosome in step (c) to the length of said at least one normalizing chromosome; and (iii) using the sequence tag density ratios calculated in steps (i) and (ii) to calculate a chromosome dose for chromosome 21, wherein the chromosome dose is calculated as the ratio of the sequence tag density ratio for chromosome 21 and the sequence tag density ratio for said at least one normalizing chromosome. In one embodiment, the at least one normalizing chromosome is selected from the group consisting of chromosome 9, chromosome 1, chromosome 2, chromosome 3, chromosome 4, chromosome 5, chromosome 6, chromosome 7, chromosome 8, chromosome 10, chromosome 11, chromosome 12, chromosome 13, chromosome 14, chromosome 15, chromosome 16, and chromosome 17. In one embodiment, the fetal and maternal nucleic acid molecules are cell-free DNA molecules. In some embodiments, the sequencing method for identifying the fetal trisomy 21 is a next generation sequencing method. In some embodiments, the sequencing method is a massively parallel sequencing method that uses sequencing-by-synthesis with reversible dye terminators. In other embodiments, the sequencing method is sequencing-by-ligation. In some embodiments, sequencing comprises an amplification. 
In one embodiment,the apparatus that is configured to identify fetal trisomy 21 comprises(a) a sequencing device adapted or configured for sequencing at least aportion of the nucleic acid molecules in a maternal plasma samplecomprising fetal and maternal nucleic acid molecules, thereby generatingsequence information; and (b) a computer processing system configured toperform the following steps: (i) using sequence information generated bythe sequencing device to identify a number of mapped sequence tags forchromosome 21; (ii) using sequence information generated by thesequencing device to identify a number of mapped sequence tags for atleast one normalizing chromosome; (iii) using the number of mappedsequence tags identified for chromosome 21 in step (i) and the number ofmapped sequence tags identified for the at least one normalizingchromosome in step (ii) to calculate a chromosome dose for chromosome21; and (iv) comparing said chromosome dose to at least one thresholdvalue, and thereby identifying the presence or absence of fetal trisomy21.

In one embodiment, an apparatus is provided that is adapted orconfigured to perform a method for identifying fetal trisomy 21 in amaternal plasma sample comprising fetal and maternal nucleic acidmolecules, which method comprises (a) sequencing at least a portion ofsaid nucleic acid molecules, thereby obtaining sequence information fora plurality of fetal and maternal nucleic acid molecules of a maternalplasma sample; (b) using the sequence information to identify a numberof mapped sequence tags for chromosome 21; (c) using the sequenceinformation to identify a number of mapped sequence tags for at leastone normalizing chromosome; (d) using the number of mapped sequence tagsidentified for chromosome 21 in step (b) and the number of mappedsequence tags identified for the at least one normalizing chromosome instep (c) to calculate a chromosome dose for chromosome 21; and (e)comparing said chromosome dose to at least one threshold value, andthereby identifying the presence or absence of fetal trisomy 21. In oneembodiment, step (d) comprises calculating a chromosome dose forchromosome 21 as the ratio of the number of mapped sequence tagsidentified for chromosome 21 and the number of mapped sequence tagsidentified for the at least one normalizing chromosome. 
Alternatively, step (d) comprises (i) calculating a sequence tag density ratio for chromosome 21, by relating the number of mapped sequence tags identified for chromosome 21 in step (b) to the length of chromosome 21; (ii) calculating a sequence tag density ratio for said at least one normalizing chromosome, by relating the number of mapped sequence tags identified for said at least one normalizing chromosome in step (c) to the length of said at least one normalizing chromosome; and (iii) using the sequence tag density ratios calculated in steps (i) and (ii) to calculate a chromosome dose for chromosome 21, wherein the chromosome dose is calculated as the ratio of the sequence tag density ratio for chromosome 21 and the sequence tag density ratio for said at least one normalizing chromosome. The at least one normalizing chromosome is selected from the group of chromosome 9, chromosome 1, chromosome 10, chromosome 11 and chromosome 15. In one embodiment, the fetal and maternal nucleic acid molecules are cell-free DNA molecules. In some embodiments, the sequencing method for identifying the fetal trisomy 21 is a next generation sequencing method. In some embodiments, the sequencing method is a massively parallel sequencing method that uses sequencing-by-synthesis with reversible dye terminators. In other embodiments, the sequencing method is sequencing-by-ligation. In some embodiments, sequencing comprises an amplification. In some embodiments, sequencing comprises PCR amplification. 
In one embodiment, theapparatus, which is adapted or configured for identifying fetal trisomy21 in a maternal plasma sample comprising fetal and maternal nucleicacid molecules, comprises: (a) a sequencing device adapted or configuredfor sequencing at least a portion of the nucleic acid molecules in amaternal plasma sample comprising fetal and maternal nucleic acidmolecules, thereby generating sequence information; and (b) a computerprocessing system configured to perform the following steps: (i) usingsequence information generated by the sequencing device to identify anumber of mapped sequence tags for chromosome 21; (ii) using sequenceinformation generated by the sequencing device to identify a number ofmapped sequence tags for at least one normalizing chromosome; (iii)using the number of mapped sequence tags identified for chromosome 21 instep (i) and the number of mapped sequence tags identified for the atleast one normalizing chromosome in step (ii) to calculate a chromosomedose for chromosome 21; and (iv) comparing said chromosome dose to atleast one threshold value, and thereby identifying the presence orabsence of fetal trisomy 21.

In another embodiment, the apparatus is adapted or configured for identifying fetal trisomy 18 in a maternal plasma sample comprising fetal and maternal nucleic acid molecules, according to the method described for trisomy 21, wherein the normalizing chromosome for identifying trisomy 18 is selected from chromosome 8, chromosome 2, chromosome 3, chromosome 4, chromosome 5, chromosome 6, chromosome 7, chromosome 9, chromosome 10, chromosome 11, chromosome 12, chromosome 13, and chromosome 14.

In another embodiment, the apparatus is adapted or configured for identifying fetal trisomy 13 in a maternal plasma sample comprising fetal and maternal nucleic acid molecules, according to the method described for trisomy 21, wherein the normalizing chromosome for identifying trisomy 13 is selected from chromosome 2, chromosome 3, chromosome 4, chromosome 5, chromosome 6, chromosome 7, chromosome 8, chromosome 9, chromosome 10, chromosome 11, chromosome 12, chromosome 14, chromosome 18, and chromosome 21. Preferably, the normalizing chromosome for identifying trisomy 13 is a combination of a group of chromosomes consisting of chromosome 2, chromosome 3, chromosome 4, chromosome 5 and chromosome 6.

In another embodiment, the apparatus is adapted or configured for identifying fetal monosomy X in a maternal plasma sample comprising fetal and maternal nucleic acid molecules, according to the method described for identifying trisomy 21, wherein the normalizing chromosome for identifying monosomy X is selected from chromosome 1, chromosome 2, chromosome 3, chromosome 4, chromosome 5, chromosome 6, chromosome 7, chromosome 8, chromosome 9, chromosome 10, chromosome 11, chromosome 12, chromosome 13, chromosome 14, chromosome 15, and chromosome 16. In one embodiment, the method for identifying fetal monosomy X further comprises determining the presence or absence of chromosome Y, comprising the steps: (a) using the sequence information to identify a number of mapped sequence tags for chromosome Y; (b) using the sequence information to identify a number of mapped sequence tags for at least one normalizing chromosome; (c) using the number of mapped sequence tags identified for chromosome Y in step (a) and the number of mapped sequence tags identified for the at least one normalizing chromosome in step (b) to calculate a chromosome dose for chromosome Y; and (d) comparing said chromosome dose to at least one threshold value, and thereby identifying the presence or absence of fetal chromosome Y. In one embodiment, obtaining the sequence information comprises sequencing at least a portion of said nucleic acid molecules, thereby obtaining sequence information for a plurality of fetal and maternal nucleic acid molecules of a maternal plasma sample. In one embodiment, step (c) comprises calculating a chromosome dose for chromosome Y as the ratio of the number of mapped sequence tags identified for chromosome Y and the number of mapped sequence tags identified for the at least one normalizing chromosome.
Alternatively, step (c) comprises (i) calculating a sequence tag density ratio for chromosome Y, by relating the number of mapped sequence tags identified for chromosome Y in step (a) to the length of chromosome Y; (ii) calculating a sequence tag density ratio for said at least one normalizing chromosome, by relating the number of mapped sequence tags identified for said at least one normalizing chromosome in step (b) to the length of said at least one normalizing chromosome; and (iii) using the sequence tag density ratios calculated in steps (i) and (ii) to calculate a chromosome dose for chromosome Y, wherein the chromosome dose is calculated as the ratio of the sequence tag density ratio for chromosome Y and the sequence tag density ratio for said at least one normalizing chromosome. Any one chromosome, or a group of two or more chromosomes selected from chromosomes 1-22 and chromosome X, can be used as the normalizing chromosome for chromosome Y. In one embodiment, the at least one normalizing chromosome is a group of chromosomes consisting of chromosomes 1-22, and chromosome X.
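The length-normalized alternative in steps (i) to (iii) can be sketched as below. The tag counts and chromosome lengths are round, hypothetical numbers chosen only to make the arithmetic easy to follow; they are not real chromosome lengths.

```python
def tag_density_ratio(tag_count, chromosome_length):
    """Steps (i)/(ii): mapped tags per base of chromosome length."""
    return tag_count / chromosome_length


def dose_from_densities(count_target, length_target, count_norm, length_norm):
    """Step (iii): ratio of the two sequence tag density ratios."""
    return (tag_density_ratio(count_target, length_target)
            / tag_density_ratio(count_norm, length_norm))


# Hypothetical counts and lengths, for illustration only.
dose = dose_from_densities(1_000, 50_000_000, 4_000, 100_000_000)

# The same value is obtained by rescaling the raw count ratio by the
# constant factor length_norm / length_target.
rescaled = (1_000 / 4_000) * (100_000_000 / 50_000_000)
print(dose, rescaled)
```

Because the lengths are fixed for a given reference genome, the length-normalized dose differs from the raw count ratio only by a constant factor, so either form can be compared against a threshold derived in the same way.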

The present invention is described in further detail in the following Examples which are not in any way intended to limit the scope of the invention as claimed. The attached Figures are meant to be considered as integral parts of the specification and description of the invention. The following examples are offered to illustrate, but not to limit the claimed invention.

7. EXPERIMENTAL

Example 1 Sample Processing and DNA Extraction

Peripheral blood samples were collected from pregnant women who were in their first or second trimester of pregnancy and were deemed at risk for fetal aneuploidy. Informed consent was obtained from each participant prior to the blood draw. Blood was collected before amniocentesis or chorionic villus sampling. Karyotype analysis was performed using the chorionic villus or amniocentesis samples to confirm fetal karyotype.

Peripheral blood drawn from each subject was collected in ACD tubes. One tube of blood sample (approximately 6-9 mL/tube) was transferred into one 15-mL low speed centrifuge tube. Blood was centrifuged at 2640 rpm, 4° C. for 10 min using Beckman Allegra 6 R centrifuge and rotor model GA 3.8.

For cell-free plasma extraction, the upper plasma layer was transferred to a 15-ml high speed centrifuge tube and centrifuged at 16000×g, 4° C. for 10 min using Beckman Coulter Avanti J-E centrifuge, and JA-14 rotor. The two centrifugation steps were performed within 72 h after blood collection. Cell-free plasma was stored at −80° C. and thawed only once before DNA extraction.

Cell-free DNA was extracted from cell-free plasma by using QIAamp DNA Blood Mini kit (Qiagen) according to the manufacturer's instructions. Five milliliters of buffer AL and 500 μl of Qiagen Protease were added to 4.5 ml-5 ml of cell-free plasma. The volume was adjusted to 10 ml with phosphate buffered saline (PBS), and the mixture was incubated at 56° C. for 12 minutes. Multiple columns were used to separate the precipitated cfDNA from the solution by centrifugation at 8,000 RPM in a Beckman microcentrifuge. The columns were washed with AW1 and AW2 buffers, and the cfDNA was eluted with 55 μl of nuclease-free water. Approximately 3.5-7 ng of cfDNA was extracted from the plasma samples.

All sequencing libraries were prepared from approximately 2 ng of purified cfDNA that was extracted from maternal plasma. Library preparation was performed using reagents of the NEBNext™ DNA Sample Prep DNA Reagent Set 1 (Part No. E6000L; New England Biolabs, Ipswich, Mass.), for Illumina® as follows. Because cell-free plasma DNA is fragmented in nature, no further fragmentation by nebulization or sonication was done on the plasma DNA samples. The overhangs of approximately 2 ng purified cfDNA fragments contained in 40 μl were converted into phosphorylated blunt ends according to the NEBNext® End Repair Module by incubating in a 1.5 ml microfuge tube the cfDNA with 5 μl 10× phosphorylation buffer, 2 μl deoxynucleotide solution mix (10 mM each dNTP), 1 μl of a 1:5 dilution of DNA Polymerase I, 1 μl T4 DNA Polymerase and 1 μl T4 Polynucleotide Kinase provided in the NEBNext™ DNA Sample Prep DNA Reagent Set 1 for 15 minutes at 20° C. The enzymes were then heat inactivated by incubating the reaction mixture at 75° C. for 5 minutes. The mixture was cooled to 4° C., and dA tailing of the blunt-ended DNA was accomplished using 10 μl of the dA-tailing master mix containing the Klenow fragment (3′ to 5′ exo minus) (NEBNext™ DNA Sample Prep DNA Reagent Set 1), and incubating for 15 minutes at 37° C. Subsequently, the Klenow fragment was heat inactivated by incubating the reaction mixture at 75° C. for 5 minutes. Following the inactivation of the Klenow fragment, 1 μl of a 1:5 dilution of Illumina Genomic Adaptor Oligo Mix (Part No. 1000521; Illumina Inc., Hayward, Calif.) was used to ligate the Illumina adaptors (Non-Index Y-Adaptors) to the dA-tailed DNA using 4 μl of the T4 DNA ligase provided in the NEBNext™ DNA Sample Prep DNA Reagent Set 1, by incubating the reaction mixture for 15 minutes at 25° C.
The mixture was cooled to 4° C., and the adaptor-ligated cfDNA was purified from unligated adaptors, adaptor dimers, and other reagents using magnetic beads provided in the Agencourt AMPure XP PCR purification system (Part No. A63881; Beckman Coulter Genomics, Danvers, Mass.). Eighteen cycles of PCR were performed to selectively enrich adaptor-ligated cfDNA using Phusion® High-Fidelity Master Mix (Finnzymes, Woburn, Mass.) and Illumina's PCR primers complementary to the adaptors (Part Nos. 1000537 and 1000538). The adaptor-ligated DNA was subjected to PCR (98° C. for 30 seconds; 18 cycles of 98° C. for 10 seconds, 65° C. for 30 seconds, and 72° C. for 30 seconds; final extension at 72° C. for 5 minutes, and hold at 4° C.) using Illumina Genomic PCR Primers (Part Nos. 1000537 and 1000538) and the Phusion HF PCR Master Mix provided in the NEBNext™ DNA Sample Prep DNA Reagent Set 1, according to the manufacturer's instructions. The amplified product was purified using the Agencourt AMPure XP PCR purification system (Agencourt Bioscience Corporation, Beverly, Mass.) according to the manufacturer's instructions available at www.beckmangenomics.com/products/AMPureXPProtocol_(—000387)v001.pdf. The purified amplified product was eluted in 40 μl of Qiagen EB Buffer, and the concentration and size distribution of the amplified libraries were analyzed using the Agilent DNA 1000 Kit for the 2100 Bioanalyzer (Agilent Technologies Inc., Santa Clara, Calif.).

The amplified DNA was sequenced using Illumina's Genome Analyzer II to obtain single-end reads of 36 bp. Only about 30 bp of random sequence information are needed to identify a sequence as belonging to a specific human chromosome. Longer sequences can uniquely identify more particular targets. In the present case, a large number of 36 bp reads were obtained, covering approximately 10% of the genome. Upon completion of sequencing of the sample, the Illumina “Sequencer Control Software” transferred image and base call files to a Unix server running the Illumina “Genome Analyzer Pipeline” software version 1.51. The Illumina “Gerald” program was run to align sequences to the reference human genome that is derived from the hg18 genome provided by National Center for Biotechnology Information (NCBI36/hg18, available on the world wide web at http://genome.ucsc.edu/cgi-bin/hgGateway?org=Human&db=hg18&hgsid=166260105). The sequence data generated from the above procedure that uniquely aligned to the genome was read from Gerald output (export.txt files) by a program (c2c.p1) running on a computer running the Linux operating system. Sequence alignments with base mismatches were allowed and included in alignment counts only if they aligned uniquely to the genome. Sequence alignments with identical start and end coordinates (duplicates) were excluded.
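The counting rules described above (unique alignments only, duplicates excluded) can be sketched as follows. The `Alignment` record is a hypothetical structure, not the actual layout of the Gerald export.txt files, and the sketch assumes that the first copy of a set of duplicates is retained; the text does not state whether one copy or none is kept.

```python
from collections import Counter
from typing import NamedTuple


class Alignment(NamedTuple):
    chrom: str
    start: int
    end: int
    unique: bool  # True if the read aligned to exactly one genomic location


def count_tags(alignments):
    """Count uniquely aligned tags per chromosome, skipping duplicates."""
    seen = set()
    counts = Counter()
    for aln in alignments:
        if not aln.unique:
            continue  # keep only reads that align uniquely to the genome
        key = (aln.chrom, aln.start, aln.end)
        if key in seen:
            continue  # identical start and end coordinates: duplicate
        seen.add(key)
        counts[aln.chrom] += 1
    return counts


alignments = [
    Alignment("chr21", 100, 135, True),
    Alignment("chr21", 100, 135, True),   # duplicate of the first record
    Alignment("chr21", 200, 235, True),
    Alignment("chr14", 300, 335, False),  # not uniquely aligned
    Alignment("chr14", 400, 435, True),
]
counts = count_tags(alignments)
print(dict(counts))
```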

Between about 5 and 15 million 36 bp tags with 2 or fewer mismatches were mapped uniquely to the human genome. All mapped tags were counted and included in the calculation of chromosome doses in both test and qualifying samples. Regions extending from base 0 to base 2×10⁶, base 10×10⁶ to base 13×10⁶, and base 23×10⁶ to the end of chromosome Y, were specifically excluded from the analysis because tags derived from either male or female fetuses map to these regions of the Y-chromosome.
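The exclusion of the three chromosome Y regions can be sketched as a simple position filter. The region boundaries are those stated above; treating the boundaries as inclusive, and the names used here, are assumptions made for illustration.

```python
# (start, end) pairs in bases; None means "to the end of chromosome Y".
Y_EXCLUDED_REGIONS = [
    (0, 2_000_000),
    (10_000_000, 13_000_000),
    (23_000_000, None),
]


def in_excluded_y_region(position):
    """Return True if a chromosome Y tag position falls in an excluded region."""
    for start, end in Y_EXCLUDED_REGIONS:
        if position >= start and (end is None or position <= end):
            return True
    return False


def count_y_tags(positions):
    """Count chromosome Y tags that lie outside the excluded regions."""
    return sum(1 for p in positions if not in_excluded_y_region(p))


# Hypothetical tag start positions on chromosome Y.
y_positions = [1_000_000, 5_000_000, 11_000_000, 20_000_000, 30_000_000]
kept = count_y_tags(y_positions)
print(kept)
```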

It was noted that some variation occurred in the total number of sequence tags mapped to individual chromosomes across samples sequenced in the same run (inter-chromosomal variation), but substantially greater variation was noted to occur among different sequencing runs (inter-sequencing run variation).

Example 2 Dose and Variance for Chromosomes 13, 18, 21, X, and Y

To examine the extent of inter-chromosomal and inter-sequencing variation in the number of mapped sequence tags for all chromosomes, plasma cfDNA obtained from peripheral blood of 48 volunteer pregnant subjects was extracted and sequenced as described in Example 1, and analyzed as follows.

The total number of sequence tags that were mapped to each chromosome (sequence tag density) was determined. Alternatively, the number of mapped sequence tags may be normalized to the length of the chromosome to generate a sequence tag density ratio. The normalization to chromosome length is not a required step, and can be performed solely to reduce the number of digits in a number to simplify it for human interpretation. Chromosome lengths that can be used to normalize the sequence tag counts can be the lengths provided on the world wide web at genome.ucsc.edu/goldenPath/stats.html#hg18.

The resulting sequence tag density for each chromosome was related to the sequence tag density of each of the remaining chromosomes to derive a qualified chromosome dose, which was calculated as the ratio of the sequence tag density for the chromosome of interest e.g. chromosome 21, and the sequence tag density of each of the remaining chromosomes i.e. chromosomes 1-20, 22 and X. Table 1 provides an example of the calculated qualified chromosome dose for chromosomes of interest 13, 18, 21, X, and Y, determined in one of the qualified samples. Chromosome doses were determined for all chromosomes in all samples, and the average doses for chromosomes of interest 13, 18, 21, X and Y in the qualified samples are provided in Tables 2 and 3, and depicted in FIGS. 2-6. FIGS. 2-6 also depict the chromosome doses for the test samples. The chromosome doses for each of the chromosomes of interest in the qualified samples provide a measure of the variation in the total number of mapped sequence tags for each chromosome of interest relative to that of each of the remaining chromosomes. Thus, qualified chromosome doses can identify the chromosome or a group of chromosomes i.e. normalizing chromosome, that has a variation among samples that is closest to the variation of the chromosome of interest, and that would serve as ideal sequences for normalizing values for further statistical evaluation. FIGS. 7 and 8 depict the calculated average chromosome doses determined in a population of qualified samples for chromosomes 13, 18, and 21, and chromosomes X and Y.
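Relating the tag count of a chromosome of interest to each of the remaining chromosomes can be sketched as below. The three tag counts are the ones reported for sample #11403 in Table 4 (a test sample, used here only because its raw counts are given); the function name is hypothetical.

```python
def doses_against_all(tag_counts, chromosome_of_interest):
    """Dose of the chromosome of interest relative to every other chromosome."""
    target = tag_counts[chromosome_of_interest]
    return {
        chrom: target / count
        for chrom, count in tag_counts.items()
        if chrom != chromosome_of_interest
    }


# Tag counts from Table 4 (sample #11403); only three chromosomes are listed there.
tag_counts = {"chr21": 333_660, "chr14": 795_050, "chr15": 756_533}
doses = doses_against_all(tag_counts, "chr21")
for chrom, dose in doses.items():
    print(chrom, round(dose, 6))
```

Repeating this calculation over all qualified samples gives, for each candidate normalizing chromosome, a distribution of doses from which the mean, standard deviation and coefficient of variation can be taken.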

In some instances, the best normalizing chromosome may not have the least variation, but may have a distribution of qualified doses that best distinguishes a test sample or samples from the qualified samples i.e. the best normalizing chromosome may not have the lowest variation, but may have the greatest differentiability. Thus, differentiability accounts for the variation in chromosome dose and the distribution of the doses in the qualified samples.

Tables 2 and 3 provide the coefficient of variation as the measure of variability, and Student's t-test values as a measure of differentiability for chromosomes 18, 21, X and Y, wherein the smaller the t-test value, the greater the differentiability. The differentiability for chromosome 13 was determined as the ratio of the difference between the mean chromosome dose in the qualified samples and the dose for chromosome 13 in the only T13 test sample, and the standard deviation of the mean of the qualified dose.
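The two summary measures can be sketched as follows for chromosome 13 normalized by chromosome 5, using the mean and standard deviation listed for chr5 in Table 3 and the test-sample dose listed in Table 6. The function names are hypothetical, and the t-test used for the other chromosomes is not reproduced here.

```python
def coefficient_of_variation(mean, stdev):
    """Variability of the qualified doses, expressed as a percentage."""
    return 100.0 * stdev / mean


def differentiability(test_dose, qualified_mean, qualified_stdev):
    """Distance of the test-sample dose from the qualified mean, in standard deviations."""
    return abs(test_dose - qualified_mean) / qualified_stdev


# Chromosome 13 normalized by chromosome 5 (values from Tables 3 and 6).
qualified_mean, qualified_stdev = 0.51010, 0.007922
t13_dose = 0.541343

cv = coefficient_of_variation(qualified_mean, qualified_stdev)
diff = differentiability(t13_dose, qualified_mean, qualified_stdev)
print(round(cv, 2), round(diff, 3))
```

Under these inputs the results agree with the CV and Diff entries listed for chr5 in Table 3.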

The qualified chromosome doses also serve as the basis for determining threshold values when identifying aneuploidies in test samples as described in the following.

TABLE 1 Qualified Chromosome Dose for Chromosomes 13, 18, 21, X and Y (n = 1; sample #11342, 46 XY)

Chromosome  chr 21     chr 18     chr 13     chr X      chr Y
chr1        0.149901   0.306798   0.341832   0.490969   0.003958
chr2        0.15413    0.315452   0.351475   0.504819   0.004069
chr3        0.193331   0.395685   0.44087    0.633214   0.005104
chr4        0.233056   0.476988   0.531457   0.763324   0.006153
chr5        0.219209   0.448649   0.499882   0.717973   0.005787
chr6        0.228548   0.467763   0.521179   0.748561   0.006034
chr7        0.245124   0.501688   0.558978   0.802851   0.006472
chr8        0.256279   0.524519   0.584416   0.839388   0.006766
chr9        0.309871   0.634203   0.706625   1.014915   0.008181
chr10       0.25122    0.514164   0.572879   0.822817   0.006633
chr11       0.257168   0.526338   0.586443   0.8423     0.00679
chr12       0.275192   0.563227   0.627544   0.901332   0.007265
chr13       0.438522   0.897509   1          1.436285   0.011578
chr14       0.405957   0.830858   0.925738   1.329624   0.010718
chr15       0.406855   0.832697   0.927786   1.332566   0.010742
chr16       0.376148   0.769849   0.857762   1.231991   0.009931
chr17       0.383027   0.783928   0.873448   1.254521   0.010112
chr18       0.488599   1          1.114194   1.600301   0.0129
chr19       0.535867   1.096742   1.221984   1.755118   0.014148
chr20       0.467308   0.956424   1.065642   1.530566   0.012338
chr21       1          2.046668   2.280386   3.275285   0.026401
chr22       0.756263   1.547819   1.724572   2.476977   0.019966
chrX        0.305317   0.624882   0.696241   1          0.008061
chrY        37.87675   77.52114   86.37362   124.0572   1

TABLE 2 Qualified Chromosome Dose, Variance and Differentiability for chromosomes 21, 18 and 13

         21 (n = 35)                           18 (n = 40)
         Avg      Stdev     CV     T Test      Avg      Stdev     CV     T Test
chr1     0.15335  0.001997  1.30   3.18E−10    0.31941  0.008384  2.62   0.001675
chr2     0.15267  0.001966  1.29   9.87E−07    0.31807  0.001756  0.55   4.39E−05
chr3     0.18936  0.004233  2.24   1.04E−05    0.39475  0.002406  0.61   3.39E−05
chr4     0.21998  0.010668  4.85   0.000501    0.45873  0.014292  3.12   0.001349
chr5     0.21383  0.005058  2.37   1.43E−05    0.44582  0.003288  0.74   3.09E−05
chr6     0.22435  0.005258  2.34   1.48E−05    0.46761  0.003481  0.74   2.32E−05
chr7     0.24348  0.002298  0.94   2.05E−07    0.50765  0.004669  0.92   9.07E−05
chr8     0.25269  0.003497  1.38   1.52E−06    0.52677  0.002046  0.39   4.89E−05
chr9     0.31276  0.003095  0.99   3.83E−09    0.65165  0.013851  2.13   0.000559
chr10    0.25618  0.003112  1.21   2.28E−10    0.53354  0.013431  2.52   0.002137
chr11    0.26075  0.00247   0.95   1.08E−09    0.54324  0.012859  2.37   0.000998
chr12    0.27563  0.002316  0.84   2.04E−07    0.57445  0.006495  1.13   0.000125
chr13    0.41828  0.016782  4.01   0.000123    0.87245  0.020942  2.40   0.000164
chr14    0.40671  0.002994  0.74   7.33E−08    0.84731  0.010864  1.28   0.000149
chr15    0.41861  0.007686  1.84   1.85E−10    0.87164  0.027373  3.14   0.003862
chr16    0.39977  0.018882  4.72   7.33E−06    0.83313  0.050781  6.10   0.075458
chr17    0.41394  0.02313   5.59   0.000248    0.86165  0.060048  6.97   0.088579
chr18    0.47236  0.016627  3.52   1.3E−07
chr19    0.59435  0.05064   8.52   0.01494     1.23932  0.12315   9.94   0.231139
chr20    0.49464  0.021839  4.42   2.16E−06    1.03023  0.058995  5.73   0.061101
chr21                                          2.03419  0.08841   4.35   2.81E−05
chr22    0.84824  0.070613  8.32   0.02209     1.76258  0.169864  9.64   0.181808
chrX     0.27846  0.015546  5.58   0.000213    0.58691  0.026637  4.54   0.064883

TABLE 3 Qualified Chromosome Dose, Variance and Differentiability for chromosomes 13, X, and Y

         13 (n = 47)                           X (n = 19)
         Avg      Stdev     CV     Diff        Avg      Stdev     CV     T Test
chr1     0.36536  0.01775   4.86   1.904       0.56717  0.025988  4.58   0.001013
chr2     0.36400  0.009817  2.70   2.704       0.56753  0.014871  2.62   9.6E−08
chr3     0.45168  0.007809  1.73   3.592       0.70524  0.011932  1.69   6.13E−11
chr4     0.52541  0.005264  1.00   3.083       0.82491  0.010537  1.28   1.75E−15
chr5     0.51010  0.007922  1.55   3.944       0.79690  0.012227  1.53   1.29E−11
chr6     0.53516  0.008575  1.60   3.758       0.83594  0.013719  1.64   2.79E−11
chr7     0.58081  0.017692  3.05   2.445       0.90507  0.026437  2.92   7.41E−07
chr8     0.60261  0.015434  2.56   2.917       0.93990  0.022506  2.39   2.11E−08
chr9     0.74559  0.032065  4.30   2.102       1.15822  0.047092  4.07   0.000228
chr10    0.61018  0.029139  4.78   2.060       0.94713  0.042866  4.53   0.000964
chr11    0.62133  0.028323  4.56   2.081       0.96544  0.041782  4.33   0.000419
chr12    0.65712  0.021853  3.33   2.380       1.02296  0.032276  3.16   3.95E−06
chr13                                          1.56771  0.014258  0.91   2.47E−15
chr14    0.96966  0.034017  3.51   2.233       1.50951  0.05009   3.32   8.24E−06
chr15    0.99673  0.053512  5.37   1.888       1.54618  0.077547  5.02   0.002925
chr16    0.95169  0.080007  8.41   1.613       1.46673  0.117073  7.98   0.114232
chr17    0.98547  0.091918  9.33   1.484       1.51571  0.132775  8.76   0.188271
chr18    1.13124  0.040032  3.54   2.312       1.74146  0.072447  4.16   0.001674
chr19    1.41624  0.174476  12.32  1.306       2.16586  0.252888  11.68  0.460752
chr20    1.17705  0.094807  8.05   1.695       1.81576  0.137494  7.57   0.08801
chr21    2.33660  0.131317  5.62   1.927       3.63243  0.235392  6.48   0.00675
chr22    2.01678  0.243883  12.09  1.364       3.08943  0.34981   11.32  0.409449
chrX     0.66679  0.028788  4.32   1.114
chr2-6   0.46751  0.006762  1.45   4.066
chr3-6   0.50332  0.005161  1.03   5.260
chr_tot                                        1.13209  0.038485  3.40   2.7E−05

             Y (n = 26)
             Avg      Stdev     CV     T Test
Chr 1-22, X  0.00734  0.002611  30.81  1.8E−12

Examples of diagnoses of T21, T13, T18 and a case of Turner syndrome obtained using the normalizing chromosomes, chromosome doses and differentiability for each of the chromosomes of interest are described in Example 3.

Example 3 Diagnosis of Fetal Aneuploidy Using Normalizing Chromosomes

To apply the use of chromosome doses for assessing aneuploidy in a biological test sample, maternal blood test samples were obtained from pregnant volunteers and cfDNA was prepared, sequenced and analyzed as described in Examples 1 and 2.

Trisomy 21

Table 4 provides the calculated dose for chromosome 21 in an exemplary test sample (#11403). The calculated threshold for the positive diagnosis of T21 aneuploidy was set at >2 standard deviations from the mean of the qualified (normal) samples. A diagnosis for T21 was given based on the chromosome dose in the test sample being greater than the set threshold. Chromosomes 14 and 15 were used as normalizing chromosomes in separate calculations to show that either a chromosome having the lowest variability e.g. chromosome 14, or a chromosome having the greatest differentiability e.g. chromosome 15, can be used to identify the aneuploidy. Thirteen T21 samples were identified using the calculated chromosome doses, and the aneuploidy samples were confirmed to be T21 by karyotype.

TABLE 4 Chromosome Dose for a T21 aneuploidy (sample #11403, 47 XY + 21)

Chromosome  Sequence Tag Density  Chromosome Dose for Chr 21  Threshold
Chr21       333,660               0.419672                    0.412696
Chr14       795,050
Chr21       333,660               0.441038                    0.433978
Chr15       756,533
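The threshold comparison for sample #11403 can be reproduced approximately from the chr14 row of Table 2 and the tag counts in Table 4. This is a sketch with hypothetical function names; small differences from the tabulated threshold come from rounding of the published mean and standard deviation.

```python
def upper_threshold(qualified_mean, qualified_stdev, n_sd=2.0):
    """Threshold set n_sd standard deviations above the qualified mean."""
    return qualified_mean + n_sd * qualified_stdev


# Qualified chromosome 21 dose with chromosome 14 as normalizer (Table 2).
threshold = upper_threshold(0.40671, 0.002994)

# Tag counts for the test sample (Table 4).
dose = 333_660 / 795_050

is_t21 = dose > threshold
print(round(dose, 6), round(threshold, 6), is_t21)
```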

Trisomy 18

Table 5 provides the calculated dose for chromosome 18 in a test sample (#11390). The calculated threshold for the positive diagnosis of T18 aneuploidy was set at 2 standard deviations from the mean of the qualified (normal) samples. A diagnosis for T18 was given based on the chromosome dose in the test sample being greater than the set threshold. Chromosome 8 was used as the normalizing chromosome. In this instance chromosome 8 had the lowest variability and the greatest differentiability. Eight T18 samples were identified using chromosome doses, and were confirmed to be T18 by karyotype.

These data show that a normalizing chromosome can have both the lowest variability and the greatest differentiability.

TABLE 5 Chromosome Dose for a T18 aneuploidy (sample #11390, 47 XY + 18)

Chromosome  Sequence Tag Density  Chromosome Dose for Chr 18  Threshold
Chr18       602,506               0.585069                    0.530867
Chr8        1,029,803

Trisomy 13

Table 6 provides the calculated dose for chromosome 13 in a test sample (#51236). The calculated threshold for the positive diagnosis of T13 aneuploidy was set at 2 standard deviations from the mean of the qualified samples. A diagnosis for T13 was given based on the chromosome dose in the test sample being greater than the set threshold. The chromosome dose for chromosome 13 was calculated using either chromosome 5 or the group of chromosomes 3, 4, 5, and 6 as the normalizing chromosome. One T13 sample was identified.

TABLE 6 Chromosome Dose for a T13 aneuploidy (sample #51236, 47 XY + 13)

Chromosome         Sequence Tag Density  Chromosome Dose for Chr 13  Threshold
Chr13              692,242               0.541343                    0.52594
Chr5               1,278,749
Chr13              692,242               0.530472                    0.513647
Chr3-6 [average]   1,304,954

The sequence tag density for chromosomes 3-6 is the average of the tag counts for chromosomes 3-6.

The data show that the combination of chromosomes 3, 4, 5 and 6 provides a variability that is lower than that of chromosome 5, and a differentiability that is greater than that of any of the other chromosomes.

Thus, a group of chromosomes can be used as the normalizing chromosome to determine chromosome doses and identify aneuploidies.
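Using a group of chromosomes as the normalizer amounts to dividing by the average tag count of the group, as in the following sketch. The tag counts are toy values chosen for easy arithmetic and are not taken from any sample; the function name is hypothetical.

```python
from statistics import mean


def group_dose(tag_counts, chromosome_of_interest, normalizing_group):
    """Dose relative to the average tag count of a group of normalizing chromosomes."""
    group_average = mean(tag_counts[c] for c in normalizing_group)
    return tag_counts[chromosome_of_interest] / group_average


# Toy tag counts (hypothetical).
tag_counts = {"chr13": 500, "chr3": 1000, "chr4": 900, "chr5": 1100, "chr6": 1000}
dose = group_dose(tag_counts, "chr13", ["chr3", "chr4", "chr5", "chr6"])
print(dose)
```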

Turner Syndrome (Monosomy X)

Table 7 provides the calculated dose for chromosomes X and Y in a test sample (#51238). The calculated threshold for the positive diagnosis of Turner Syndrome (monosomy X) was set for the X chromosome at <−2 standard deviations from the mean, and for the absence of the Y chromosome at <−2 standard deviations from the mean for qualified (normal) samples.

TABLE 7 Chromosome Dose for a Turners (XO) aneuploidy (sample #51238, 45 X)

Chromosome                     Sequence Tag Density  Chromosome Dose for Chr X and Chr Y  Threshold
ChrX                           873,631               0.786642                             0.803832
Chr4                           1,110,582
ChrY                           1,321                 0.001542101                          0.00211208
Chr_Total (1-22, X) (Average)  856,623.6

A sample having an X chromosome dose less than the set threshold was identified as having less than one X chromosome. The same sample was determined to have a Y chromosome dose that was less than the set threshold, indicating that the sample did not have a Y chromosome. Thus, the combination of chromosome doses for X and Y was used to identify the Turner Syndrome (monosomy X) samples.
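The two-part decision for monosomy X can be sketched with the tag counts and thresholds listed in Table 7. The function name is hypothetical, and the sketch covers only the monosomy X call described here, not other sex chromosome outcomes.

```python
def is_monosomy_x(x_dose, y_dose, x_threshold, y_threshold):
    """Monosomy X is called when both the X and Y doses fall below their thresholds."""
    x_under_represented = x_dose < x_threshold
    y_absent = y_dose < y_threshold
    return x_under_represented and y_absent


# Sample #51238 (Table 7): chromosome 4 normalizes X; the average count of
# chromosomes 1-22 and X normalizes Y.
x_dose = 873_631 / 1_110_582
y_dose = 1_321 / 856_623.6

call = is_monosomy_x(x_dose, y_dose, x_threshold=0.803832, y_threshold=0.00211208)
print(round(x_dose, 6), round(y_dose, 9), call)
```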

Thus, the method provided enables the determination of CNV of chromosomes. In particular, the method enables the determination of over- and under-representation chromosomal aneuploidies by massively parallel sequencing of maternal plasma cfDNA and identification of normalizing chromosomes for the statistical analysis of the sequencing data. The sensitivity and reliability of the method allow for accurate first and second trimester aneuploidy testing.

Example 4 Determination of Partial Aneuploidy

The use of sequence doses was applied for assessing partial aneuploidy in a biological test sample of cfDNA that was prepared from blood plasma, and sequenced as described in Example 1. The sample was confirmed by karyotyping to have been derived from a subject with a partial deletion of chromosome 11.

Analysis of the sequencing data for the partial aneuploidy (partial deletion of chromosome 11 i.e. q21-q23) was performed as described for the chromosomal aneuploidies in the previous examples. Mapping of the sequence tags to chromosome 11 in a test sample revealed a noticeable loss of tag counts between base pairs 81000082-103000103 in the q arm of the chromosome relative to the tag counts obtained for the corresponding sequence on chromosome 11 in the qualified samples (data not shown). Sequence tags mapped to the sequence of interest on chromosome 11 (81000082-103000103 bp) in each of the qualified samples, and sequence tags mapped to all 20 megabase segments in the entire genome in the qualified samples i.e. qualified sequence tag densities, were used to determine qualified sequence doses as ratios of tag densities in all qualified samples. The average sequence dose, standard deviation, and coefficient of variation were calculated for all 20 megabase segments in the entire genome, and the 20-megabase sequence having the least variability was the identified normalizing sequence on chromosome 5 (13000014-33000033 bp) (see Table 8), which was used to calculate the dose for the sequence of interest in the test sample (see Table 9). Table 9 provides the sequence dose for the sequence of interest on chromosome 11 (81000082-103000103 bp) in the test sample that was calculated as the ratio of sequence tags mapped to the sequence of interest and the sequence tags mapped to the identified normalizing sequence. FIG. 10 shows the sequence doses for the sequence of interest in the 7 qualified samples (O) and the sequence dose for the corresponding sequence in the test sample (O). The mean is shown by the solid line, and the calculated threshold for the positive diagnosis of partial aneuploidy that was set 5 standard deviations from the mean is shown by the dashed line. A diagnosis for partial aneuploidy was based on the sequence dose in the test sample being less than the set threshold.
The test sample was verified by karyotyping to have deletion q21-q23 on chromosome 11.

Therefore, in addition to identifying chromosomal aneuploidies, the method of the invention can be used to identify partial aneuploidies.

TABLE 8 Qualified Normalizing Sequence, Dose and Variance for Sequence Chr11: 81000082-103000103 (qualified samples n = 7)

                         Chr11: 81000082-103000103
                         Avg       Stdev     CV
Chr5: 13000014-33000033  1.164702  0.004914  0.42

TABLE 9 Sequence Dose for Sequence of Interest (81000082-103000103) on Chromosome 11 (test sample 11206)

Chromosome Segment         Sequence Tag Density  Chromosome Segment Dose for Chr11 (q21-q23)  Threshold
Chr11: 81000082-103000103  27,052                1.0434313                                    1.1401347
Chr5: 13000014-33000033    25,926
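The partial deletion call can be reproduced approximately from Tables 8 and 9: the segment dose is the ratio of tag counts in the two 20-megabase segments, and the threshold lies 5 standard deviations below the qualified mean. This is a sketch with hypothetical function names; the small difference from the tabulated threshold reflects rounding of the published mean and standard deviation.

```python
def lower_threshold(qualified_mean, qualified_stdev, n_sd):
    """Threshold set n_sd standard deviations below the qualified mean."""
    return qualified_mean - n_sd * qualified_stdev


# Qualified sequence dose statistics for the chr5 normalizing segment (Table 8).
threshold = lower_threshold(1.164702, 0.004914, n_sd=5)

# Tag counts in the test sample (Table 9): chr11 segment of interest / chr5 segment.
segment_dose = 27_052 / 25_926

has_partial_deletion = segment_dose < threshold
print(round(segment_dose, 7), round(threshold, 6), has_partial_deletion)
```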

Example 5 Demonstration of Detection of Aneuploidy

Sequencing data obtained for the samples described in Examples 2 and 3, and shown in FIGS. 2-6, were further analyzed to illustrate the sensitivity of the method in successfully identifying aneuploidies in maternal samples. Normalized chromosome doses for chromosomes 21, 18, 13, X and Y were analyzed as a distribution relative to the standard deviation of the mean (Y-axis) and shown in FIG. 11. The normalizing chromosome used is shown as the denominator (X-axis).

FIG. 11 (A) shows the distribution of chromosome doses relative to the standard deviation from the mean for chromosome 21 dose in the unaffected samples (o) and the trisomy 21 samples (T21; Δ) when using chromosome 14 as the normalizing chromosome for chromosome 21. FIG. 11 (B) shows the distribution of chromosome doses relative to the standard deviation from the mean for chromosome 18 dose in the unaffected samples (o) and the trisomy 18 samples (T18; Δ) when using chromosome 8 as the normalizing chromosome for chromosome 18. FIG. 11 (C) shows the distribution of chromosome doses relative to the standard deviation from the mean for chromosome 13 dose in the unaffected samples (o) and the trisomy 13 samples (T13; Δ), using the average sequence tag density of the group of chromosomes 3, 4, 5, and 6 as the normalizing chromosome to determine the chromosome dose for chromosome 13. FIG. 11 (D) shows the distribution of chromosome doses relative to the standard deviation from the mean for chromosome X dose in the unaffected female samples (o), the unaffected male samples (Δ), and the monosomy X samples (XO; +) when using chromosome 4 as the normalizing chromosome for chromosome X. FIG. 11 (E) shows the distribution of chromosome doses relative to the standard deviation from the mean for chromosome Y dose in the unaffected male samples (o), the unaffected female samples (Δ), and the monosomy X samples (+), when using the average sequence tag density of the group of chromosomes 1-22 and X as the normalizing chromosome to determine the chromosome dose for chromosome Y.

The data show that trisomy 21, trisomy 18, and trisomy 13 were clearly distinguishable from the unaffected (normal) samples. The monosomy X samples were easily identifiable as having chromosome X doses that were clearly lower than those of unaffected female samples (FIG. 11 (D)), and as having chromosome Y doses that were clearly lower than those of the unaffected male samples (FIG. 11 (E)).
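Expressing a chromosome dose relative to the standard deviation from the qualified mean, as plotted on the Y-axis of FIG. 11, can be sketched as follows. The inputs are the chr14 statistics from Table 2 and the dose from Table 4; the function name is hypothetical, and the exact scaling used to draw the figure is an assumption.

```python
def standard_deviations_from_mean(dose, qualified_mean, qualified_stdev):
    """Signed distance of a dose from the qualified mean, in standard deviations."""
    return (dose - qualified_mean) / qualified_stdev


# Chromosome 21 dose of sample #11403 with chromosome 14 as normalizer.
z = standard_deviations_from_mean(0.419672, 0.40671, 0.002994)
print(round(z, 2))
```

A positive value indicates over-representation of the chromosome of interest and a negative value indicates under-representation, which is how trisomies and monosomy X separate from the unaffected samples in the figure.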

Therefore, the method provided is sensitive and specific for determining the presence or absence of chromosomal aneuploidies in a maternal blood sample.

While preferred embodiments of the present invention have been shown and described herein, it will be obvious to those skilled in the art that such embodiments are provided by way of example only. Numerous variations, changes, and substitutions will now occur to those skilled in the art without departing from the invention. It should be understood that various alternatives to the embodiments of the invention described herein may be employed in practicing the invention. It is intended that the following claims define the scope of the invention and that methods and structures within the scope of these claims and their equivalents be covered thereby.

1. A method for identifying fetal trisomy 21, said method comprising the steps: (a) obtaining sequence information for a plurality of fetal and maternal nucleic acid molecules of a maternal blood sample; (b) using the sequence information to identify a number of mapped sequence tags for chromosome 21; (c) using the sequence information to identify a number of mapped sequence tags for at least one normalizing chromosome; (d) using the number of mapped sequence tags identified for chromosome 21 in step (b) and the number of mapped sequence tags identified for the at least one normalizing chromosome in step (c) to calculate a chromosome dose for chromosome 21; and (e) comparing said chromosome dose to at least one threshold value, and thereby identifying the presence or absence of fetal trisomy 21.
2. The method of claim 1, further comprising sequencing at least a portion of said nucleic acid molecules to obtain sequence information for a plurality of fetal and maternal nucleic acid molecules of a maternal blood sample.
3. The method of claim 1 or claim 2, wherein step (d) comprises calculating a chromosome dose for chromosome 21 as the ratio of the number of mapped sequence tags identified for chromosome 21 and the number of mapped sequence tags identified for the at least one normalizing chromosome.
4. The method of claim 1 or claim 2, wherein step (d) comprises: (i) calculating a sequence tag density ratio for chromosome 21, by relating the number of mapped sequence tags identified for chromosome 21 in step (b) to the length of chromosome 21; (ii) calculating a sequence tag density ratio for said at least one normalizing chromosome, by relating the number of mapped sequence tags identified for said at least one normalizing chromosome in step (c) to the length of said at least one normalizing chromosome; and (iii) using the sequence tag density ratios calculated in steps (i) and (ii) to calculate a chromosome dose for chromosome 21, wherein the chromosome dose is calculated as the ratio of the sequence tag density ratio for chromosome 21 and the sequence tag density ratio for said at least one normalizing chromosome.
5. The method of claim 1 or claim 2, wherein said at least one normalizing chromosome is a chromosome having the smallest variability and/or the greatest differentiability.
6. The method of claim 1 or claim 2, wherein said at least one normalizing chromosome is selected from chromosome 9, chromosome 1, chromosome 2, chromosome 11, chromosome 12, and chromosome 14.
7. The method of claim 1 or claim 2, wherein said at least one normalizing chromosome is a group of chromosomes selected from chromosome 9, chromosome 1, chromosome 2, chromosome 11, chromosome 12, and chromosome 14.
8. The method of claim 1 or claim 2, wherein said fetal and maternal nucleic acid molecules are cell-free DNA molecules.
9. The method of claim 1 or claim 2, wherein said sequencing is next generation sequencing (NGS).
10. The method of claim 1 or claim 2, wherein said sequencing is massively parallel sequencing using sequencing-by-synthesis with reversible dye terminators.
11. The method of claim 1 or claim 2, wherein said sequencing is sequencing-by-ligation.
12. The method of claim 1 or claim 2, wherein said sequencing comprises an amplification.
13. The method of claim 1 or claim 2, wherein said sequencing is single molecule sequencing.
14. A method for identifying fetal trisomy 18, said method comprising the steps: (a) obtaining sequence information for a plurality of fetal and maternal nucleic acid molecules of a maternal blood sample; (b) using the sequence information to identify a number of mapped sequence tags for chromosome 18; (c) using the sequence information to identify a number of mapped sequence tags for at least one normalizing chromosome; (d) using the number of mapped sequence tags identified for chromosome 18 in step (b) and the number of mapped sequence tags identified for the at least one normalizing chromosome in step (c) to calculate a chromosome dose for chromosome 18; and (e) comparing said chromosome dose to at least one threshold value, and thereby identifying the presence or absence of fetal trisomy 18.
15. The method of claim 14, further comprising sequencing at least a portion of said nucleic acid molecules, to obtain sequence information for a plurality of fetal and maternal nucleic acid molecules of a maternal blood sample.
16. The method of claim 14 or claim 15, wherein step (d) comprises calculating a chromosome dose for chromosome 18 as the ratio of the number of mapped sequence tags identified for chromosome 18 and the number of mapped sequence tags identified for the at least one normalizing chromosome.
17. The method of claim 14 or claim 15, wherein step (d) comprises: (i) calculating a sequence tag density ratio for chromosome 18, by relating the number of mapped sequence tags identified for chromosome 18 in step (b) to the length of chromosome 18; (ii) calculating a sequence tag density ratio for said at least one normalizing chromosome, by relating the number of mapped sequence tags identified for said at least one normalizing chromosome in step (c) to the length of said at least one normalizing chromosome; and (iii) using the sequence tag density ratios calculated in steps (i) and (ii) to calculate a chromosome dose for chromosome 18, wherein the chromosome dose is calculated as the ratio of the sequence tag density ratio for chromosome 18 and the sequence tag density ratio for said at least one normalizing chromosome.
18. The method of claim 14 or claim 15, wherein said at least one normalizing chromosome is a chromosome having the smallest variability and/or the greatest differentiability.
19. The method of claim 14 or claim 15, wherein said at least one normalizing chromosome is selected from chromosome 8, chromosome 2, chromosome 3, chromosome 5, chromosome 6, chromosome 12, and chromosome 14.
20. The method of claim 14 or claim 15, wherein said at least one normalizing chromosome is a group of chromosomes selected from chromosome 8, chromosome 2, chromosome 3, chromosome 5, chromosome 6, chromosome 12, and chromosome 14.
21. The method of claim 14 or claim 15, wherein said fetal and maternal nucleic acid molecules are cell-free DNA molecules.
22. The method of claim 14 or claim 15, wherein said sequencing is next generation sequencing (NGS).
23. The method of claim 14 or claim 15, wherein said sequencing is massively parallel sequencing using sequencing-by-synthesis with reversible dye terminators.
24. The method of claim 14 or claim 15, wherein said sequencing is sequencing-by-ligation.
25. The method of claim 14 or claim 15, wherein said sequencing comprises an amplification.
26. The method of claim 14 or claim 15, wherein said sequencing is single molecule sequencing.
27. A method for identifying fetal trisomy 13, said method comprising the steps: (a) obtaining sequence information for a plurality of fetal and maternal nucleic acid molecules of a maternal blood sample; (b) using the sequence information to identify a number of mapped sequence tags for chromosome 13; (c) using the sequence information to identify a number of mapped sequence tags for at least one normalizing chromosome; (d) using the number of mapped sequence tags identified for chromosome 13 in step (b) and the number of mapped sequence tags identified for the at least one normalizing chromosome in step (c) to calculate a chromosome dose for chromosome 13; and (e) comparing said chromosome dose to at least one threshold value, and thereby identifying the presence or absence of fetal trisomy 13.
28. The method of claim 27, further comprising sequencing at least a portion of said nucleic acid molecules, to obtain sequence information for a plurality of fetal and maternal nucleic acid molecules of a maternal blood sample.
29. The method of claim 27 or claim 28, wherein step (d) comprises calculating a chromosome dose for chromosome 13 as the ratio of the number of mapped sequence tags identified for chromosome 13 and the number of mapped sequence tags identified for the at least one normalizing chromosome.
30. The method of claim 27 or claim 28, wherein step (d) comprises: (i) calculating a sequence tag density ratio for chromosome 13, by relating the number of mapped sequence tags identified for chromosome 13 in step (b) to the length of chromosome 13; (ii) calculating a sequence tag density ratio for said at least one normalizing chromosome, by relating the number of mapped sequence tags identified for said at least one normalizing chromosome in step (c) to the length of said at least one normalizing chromosome; and (iii) using the sequence tag density ratios calculated in steps (i) and (ii) to calculate a chromosome dose for chromosome 13, wherein the chromosome dose is calculated as the ratio of the sequence tag density ratio for chromosome 13 and the sequence tag density ratio for said at least one normalizing chromosome.
31. The method of claim 27 or claim 28, wherein said at least one normalizing chromosome is a chromosome having the smallest variability and/or the greatest differentiability.
32. The method of claim 27 or claim 28, wherein said at least one normalizing chromosome is selected from chromosome 2, chromosome 3, chromosome 4, chromosome 5, chromosome 6, and chromosome 8.
33. The method of claim 27 or claim 28, wherein said at least one normalizing chromosome is a group of chromosomes selected from chromosome 2, chromosome 3, chromosome 4, chromosome 5, chromosome 6, and chromosome 8.
34. The method of claim 27 or claim 28, wherein said fetal and maternal nucleic acid molecules are cell-free DNA molecules.
35. The method of claim 27 or claim 28, wherein said sequencing is next generation sequencing (NGS).
36. The method of claim 27 or claim 28, wherein said sequencing is massively parallel sequencing using sequencing-by-synthesis with reversible dye terminators.
37. The method of claim 27 or claim 28, wherein said sequencing is sequencing-by-ligation.
38. The method of claim 27 or claim 28, wherein said sequencing comprises an amplification.
39. The method of claim 27 or claim 28, wherein said sequencing is single molecule sequencing.
40. A method for identifying fetal monosomy X, said method comprising the steps: (a) obtaining sequence information for a plurality of fetal and maternal nucleic acid molecules of a maternal blood sample; (b) using the sequence information to identify a number of mapped sequence tags for chromosome X; (c) using the sequence information to identify a number of mapped sequence tags for at least one normalizing chromosome; (d) using the number of mapped sequence tags identified for chromosome X in step (b) and the number of mapped sequence tags identified for the at least one normalizing chromosome in step (c) to calculate a chromosome dose for chromosome X; and (e) comparing said chromosome dose to at least one threshold value, and thereby identifying the presence or absence of fetal monosomy X.
41. The method of claim 40, further comprising sequencing at least a portion of said nucleic acid molecules, to obtain sequence information for a plurality of fetal and maternal nucleic acid molecules of a maternal blood sample.
42. The method of claim 40 or claim 41, wherein step (d) comprises calculating a chromosome dose for chromosome X as the ratio of the number of mapped sequence tags identified for chromosome X and the number of mapped sequence tags identified for the at least one normalizing chromosome.
43. The method of claim 40 or claim 41, wherein step (d) comprises: (i) calculating a sequence tag density ratio for chromosome X, by relating the number of mapped sequence tags identified for chromosome X in step (b) to the length of chromosome X; (ii) calculating a sequence tag density ratio for said at least one normalizing chromosome, by relating the number of mapped sequence tags identified for said at least one normalizing chromosome in step (c) to the length of said at least one normalizing chromosome; and (iii) using the sequence tag density ratios calculated in steps (i) and (ii) to calculate a chromosome dose for chromosome X, wherein the chromosome dose is calculated as the ratio of the sequence tag density ratio for chromosome X and the sequence tag density ratio for said at least one normalizing chromosome.
44. The method of claim 40 or claim 41, wherein said at least one normalizing chromosome is a chromosome having the smallest variability and/or the greatest differentiability.
45. The method of claim 40 or claim 41, wherein said at least one normalizing chromosome is selected from chromosome 2, chromosome 3, chromosome 4, chromosome 5, chromosome 6, and chromosome 8.
46. The method of claim 40 or claim 41, wherein said at least one normalizing chromosome is a group of chromosomes selected from chromosome 2, chromosome 3, chromosome 4, chromosome 5, chromosome 6, and chromosome 8.
47. The method of claim 40 or claim 41, wherein said fetal and maternal nucleic acid molecules are cell-free DNA molecules.
48. The method of claim 40 or claim 41, wherein said sequencing is next generation sequencing (NGS).
49. The method of claim 40 or claim 41, wherein said sequencing is massively parallel sequencing using sequencing-by-synthesis with reversible dye terminators.
50. The method of claim 40 or claim 41, wherein said sequencing is sequencing-by-ligation.
51. The method of claim 40 or claim 41, wherein said sequencing comprises an amplification.
52. The method of claim 40 or claim 41, wherein said sequencing is single molecule sequencing.
53. A method for identifying fetal chromosomal aneuploidy in a test sample, said method comprising: (a) obtaining a test sample and a plurality of qualified samples, said test sample comprising test nucleic acid molecules and said plurality of qualified samples comprising qualified nucleic acid molecules; (b) sequencing at least a portion of said qualified and test nucleic acid molecules, wherein said sequencing comprises providing a plurality of mapped sequence tags for a test and a qualified chromosome sequence of interest, and for at least one test and at least one qualified normalizing chromosome; (c) based on said sequencing of said qualified chromosome, calculating a qualified chromosome dose for said qualified chromosome of interest in each of said plurality of qualified samples, wherein said calculating a qualified chromosome dose comprises determining a parameter for said qualified chromosome of interest and at least one qualified normalizing chromosome; (d) based on said qualified chromosome dose, identifying at least one qualified normalizing chromosome, wherein said at least one qualified normalizing chromosome has the smallest variability and/or the greatest differentiability in sequence chromosome dose in said plurality of qualified samples; (e) based on said sequencing of said nucleic acid molecules in said test sample, calculating a test chromosome dose for said test chromosome of interest, wherein said calculating a test chromosome dose comprises determining a parameter for said test chromosome of interest and at least one normalizing test chromosome, and wherein said at least one normalizing test chromosome corresponds to said at least one normalizing chromosome sequence; (f) comparing said test chromosome dose to at least one threshold value; and (g) determining said fetal aneuploidy based on the outcome of step (f).
54. The method of claim 53, wherein said parameter for said qualified chromosome of interest and at least one qualified normalizing chromosome relates the number of sequence tags mapped to said qualified chromosome of interest to the number of tags mapped to said normalizing chromosome sequence, and wherein said parameter for said test chromosome of interest and at least one normalizing test chromosome relates the number of sequence tags mapped to said test chromosome of interest to the number of tags mapped to said normalizing chromosome sequence.
55. The method of claim 53, wherein said test and qualified sample is a substantially cell-free biological sample.
56. The method of claim 53, wherein said sample is chosen from a maternal blood, plasma, serum, urine, and saliva sample.
57. The method of claim 53, wherein said sample is a maternal plasma sample.
58. The method of claim 53, wherein said fetal and maternal nucleic acid molecules are cell-free DNA molecules.
59. The method of claim 53, wherein said sequencing is next generation sequencing (NGS).
60. The method of claim 53, wherein said sequencing is massively parallel sequencing using sequencing-by-synthesis with reversible dye terminators.
61. The method of claim 53, wherein said sequencing is sequencing-by-ligation.
62. The method of claim 53, wherein said step of sequencing comprises an amplification.
63. The method of claim 53, wherein said sequencing is single molecule sequencing.
64. The method of claim 53, wherein said chromosomal aneuploidy is chosen from trisomy 8, trisomy 13, trisomy 15, trisomy 16, trisomy 18, trisomy 21, trisomy 22, monosomy X, and XXX.
65. The method of claim 53, wherein said chromosome of interest is chosen from chromosome 8, chromosome 13, chromosome 15, chromosome 16, chromosome 18, chromosome 21, chromosome 22, and chromosome X.
66. A method for identifying copy number variation (CNV) of a sequence of interest in a test sample comprising the steps of: (a) obtaining a test sample and a plurality of qualified samples, said test sample comprising test nucleic acid molecules and said plurality of qualified samples comprising qualified nucleic acid molecules; (b) sequencing at least a portion of said qualified and test nucleic acid molecules, wherein said sequencing comprises providing a plurality of mapped sequence tags for a test and a qualified sequence of interest, and for at least one test and at least one qualified normalizing sequence; (c) based on said sequencing of said qualified nucleic acid molecules, calculating a qualified sequence dose for said qualified sequence of interest in each of said plurality of qualified samples, wherein said calculating a qualified sequence dose comprises determining a parameter for said qualified sequence of interest and at least one qualified normalizing sequence; (d) based on said qualified sequence dose, identifying at least one qualified normalizing sequence, wherein said at least one qualified normalizing sequence has the smallest variability and/or the greatest differentiability in sequence dose in said plurality of qualified samples; (e) based on said sequencing of said nucleic acid molecules in said test sample, calculating a test sequence dose for said test sequence of interest, wherein said calculating a test sequence dose comprises determining a parameter for said test sequence of interest and at least one normalizing test sequence, and wherein said at least one normalizing test sequence corresponds to said at least one qualified normalizing sequence; (f) comparing said test sequence dose to at least one threshold value; and (g) identifying said copy number variation of said sequence of interest in said test sample based on the outcome of step (f).
67. The method of claim 66, wherein said parameter for said qualified sequence of interest and at least one qualified normalizing sequence relates the number of sequence tags mapped to said qualified sequence of interest to the number of tags mapped to said qualified normalizing sequence, and wherein said parameter for said test sequence of interest and said at least one normalizing test sequence relates the number of sequence tags mapped to said test sequence of interest to the number of tags mapped to said normalizing test sequence.
68. The method of claim 66, wherein said sequencing is next generation sequencing (NGS).
69. The method of claim 66, wherein said sequencing is massively parallel sequencing using sequencing-by-synthesis with reversible dye terminators.
70. The method of claim 66, wherein said sequencing is sequencing-by-ligation.
71. The method of claim 66, wherein said sequencing comprises an amplification.
72. The method of claim 55, wherein said sequencing is single molecule sequencing.
73. The method of claim 66, wherein said CNV of a sequence of interest is an aneuploidy.
74. The method of claim 66, wherein said aneuploidy is a chromosomal or a partial aneuploidy.
75. The method of claim 74, wherein said chromosomal aneuploidy is selected from trisomy 8, trisomy 13, trisomy 15, trisomy 16, trisomy 18, trisomy 21, trisomy 22, monosomy X, and XXX.
76. The method of claim 74, wherein said partial aneuploidy is a partial deletion or a partial insertion.
77. The method of claim 66, wherein said test and qualified samples are biological fluid samples.
78. The method of claim 66, wherein said samples are plasma samples.
79. The method of claim 66, wherein said test and qualified sample is a plasma sample obtained from a pregnant human subject.
80. The method of claim 66, wherein said test and qualified sample is a plasma sample obtained from a subject that is known or is suspected of having cancer.
81. The method of claim 1, 14, 27, or 40, wherein said maternal blood sample is a plasma sample.
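
The workflow recited in claims 53 and 66 can be sketched as follows, under stated assumptions. In this sketch the chromosome dose is the ratio of mapped sequence tag counts (as in claims 3 and 54), "smallest variability" is interpreted as the lowest coefficient of variation of the dose across the qualified samples, and the threshold is expressed as a number of standard deviations from the qualified mean. All function names, the cutoff of 3 standard deviations, and the tag counts are hypothetical and are not taken from the specification.

```python
from statistics import mean, stdev

def dose(tags, interest, normalizer):
    """Chromosome dose: tags mapped to the chromosome of interest divided
    by tags mapped to the normalizing chromosome."""
    return tags[interest] / tags[normalizer]

def pick_normalizer(qualified, interest, candidates):
    """Return the candidate with the lowest coefficient of variation of the
    dose across qualified samples (one reading of 'smallest variability')."""
    def cv(candidate):
        doses = [dose(sample, interest, candidate) for sample in qualified]
        return stdev(doses) / mean(doses)
    return min(candidates, key=cv)

def call_sample(test, qualified, interest, candidates, cutoff=3.0):
    """Compare the test dose with a threshold derived from qualified samples.

    Assumption: the threshold is `cutoff` standard deviations from the
    qualified mean; the claims only require "at least one threshold value".
    """
    normalizer = pick_normalizer(qualified, interest, candidates)
    ref = [dose(sample, interest, normalizer) for sample in qualified]
    z = (dose(test, interest, normalizer) - mean(ref)) / stdev(ref)
    if z > cutoff:
        return normalizer, "gain"
    if z < -cutoff:
        return normalizer, "loss"
    return normalizer, "unaffected"

# Hypothetical tag counts per chromosome (not data from the study).
qualified_samples = [
    {"chr21": 1500, "chr9": 10000, "chr1": 20500},
    {"chr21": 1515, "chr9": 10100, "chr1": 19800},
    {"chr21": 1485, "chr9": 9900, "chr1": 20900},
    {"chr21": 1506, "chr9": 10020, "chr1": 20100},
]
test_sample = {"chr21": 1600, "chr9": 10000, "chr1": 20300}
result = call_sample(test_sample, qualified_samples, "chr21", ["chr9", "chr1"])
```

A sequence tag density ratio, as recited in claims 4, 17, 30, and 43, could be substituted in this sketch by dividing each tag count by the length of its chromosome before taking the ratio.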